


32-Bit Microprogrammable Products 
Am29C300/29300 



1988 Data Book 



Advanced Micro Devices 



Advanced Micro Devices 

Am29C300/29300 
Data Book 



© 1988 Advanced Micro Devices 

Advanced Micro Devices reserves the right to make changes in its products without 
notice in order to improve design or performance characteristics. 

This manual neither states nor implies any warranty of any kind, including but not 
limited to implied warranties of merchantability or fitness for a particular application. 
AMD assumes no responsibility for the use of any circuitry other than the circuitry 
embodied in an AMD product. 

The information in this publication is believed to be accurate in all respects at the 
time of publication, but is subject to change without notice. AMD assumes no 
responsibility for any errors or omissions, and disclaims responsibility for any 
consequences resulting from the use of the information included herein. Additionally, 
AMD assumes no responsibility for the functioning of undescribed features or 
parameters. 

901 Thompson Place, P.O. Box 3453, Sunnyvale, California 94088-3000 
(408) 732-2400 TWX: 910-339-9280 TELEX: 34-6306 



AMDASM, AmSYS, and IMOX are trademarks of Advanced Micro Devices, Incorporated. 

UNIX is a trademark of Bell Laboratories. 

VAX is a registered trademark of Digital Equipment Corporation. 

Multibus is a registered trademark of Intel Corporation. 

IBM is a registered trademark of International Business Machines, Incorporated. 

IBM-AT and IBM-PC are trademarks of International Business Machines, Incorporated. 

SmartModel is a registered trademark of Logic Automation, Incorporated. 

Symbolic Hardware Debugging is a trademark of Logic Automation, Incorporated. 

QuickSim is a trademark of Mentor Graphics. 

PAL is a registered trademark of Monolithic Memories, Incorporated. 

AUTOSTEP, Meta-Disassembler, MetaStep, QuickLearn, STEP-40 SDT, and User-Defined Symbolics are 
trademarks of Step Engineering. 



Thank you for your interest in the Am29C300/29300 family of microprogrammable products. This 
manual reflects our commitment to you to bring together all the essential ingredients for making your 
32-bit system design as smooth and straightforward as possible. 

This manual contains detailed product specifications, applications information, software support 
products and third-party vendors, instruction definitions, and the latest reliability information for both 
CMOS and bipolar technologies. 

We have the quality, reliability and innovative products you need. Our worldwide hardware and 
software support teams of field applications engineers are ready to help you utilize our advanced 
microprogrammable products to complete your designs in a timely and cost-effective manner. 

It is with sincere appreciation that we welcome you to the growing family of satisfied AMD customers. 
We look forward to serving your semiconductor needs and thank you for the opportunity to contribute 
to your success. 




George Rigg 
Vice President 
Processor Products Division 
Advanced Micro Devices 



Preface 

Advanced Micro Devices is recognized as the pioneer and leader in microprogrammable "bit slice" 
integrated circuits. The Am29300 family sets the current standard in general purpose 32-bit building 
blocks. Designed for high performance and flexibility with a choice of elegant, easy to implement 
architectures, this chip set brings microprogrammable products into the next generation. 

The Am29300 generation gives the system designer flexibility both in hardware architecture and at 
the microprogram level. This 32-bit product family achieves high performance and high integration, 
while avoiding architectural restrictions. The products are designed to meet the high computational 
requirements of advanced graphics systems, image processing, high-end controllers, fault-tolerant 
processors, work stations, and other 32-bit applications limited not by process technology, but only 
by the designer's imagination. 

Chapters 2, 3, and 4 of this databook describe the current full range of the Am29300 product offerings 
in bipolar and CMOS technologies. Three different types of data sheets are presented: Advanced 
Information, Preliminary, and Final. 

• Advanced Information data sheets are developed from simulation data after 
circuit design is completed. After a process change, advanced information is 
again provided for speed select data. 

• Preliminary data sheets are based on actual measurements when silicon is 
available and units have been tested for AC characteristics. The preliminary test 
programs are in place, but the normal fabrication process variations have not 
allowed setting of final AC limits. 

• Final data-sheet status is applied to products that are fully characterized over 
the operating range and are in volume production. 

Over 75 application notes and technical articles have been written in 11 different languages 
describing the features and benefits of the Am29300/29C300 family. A few representative articles 
are reprinted in Chapter 6 to serve as a starting point for readers less familiar with the broad scope 
of this chip set. A full list of articles is offered in the bibliography of Chapter 6. 

Technical information regarding product and process reliability, as well as the Advanced Micro 
Devices model for reliability studies is provided in Chapter 7. This chapter also outlines the basic 
thermal characteristic data for the bipolar Am29300 products and describes test philosophy and 
methods. 

Chapter 8 gives general information regarding package outlines and ordering information. 



1.1 Am29300/29C300 GENERAL OVERVIEW 

CMOS and Bipolar 32-Bit High Performance 
Building Blocks 

AMD's Am29300/29C300 family has been developed to 
provide systems designers with flexible, off-the-shelf, 
high-performance, 32-bit microprogrammable building 
blocks. The Am29300/29C300 family is ideal for complex 
and calculation-intensive applications such as intelligent 
peripheral controllers including graphics, telecommunications, 
switching systems and laser printers; artificial 
intelligence and RISC CPUs; array and digital signal 
processing; and a multitude of military applications. 

Am29300/29C300 Pushes the Limits of 
Your Imagination 

Flexibility of Design 

Success is driven by innovation and differentiation. While 
"me too" systems companies merely struggle to be the 
lowest cost manufacturers, innovative companies strive 
ahead toward the future. The designers of AMD's 32-bit 
family recognize the need for system innovation and 
differentiation. The Am29300/29C300 family provides 
powerful building blocks with unlimited architectural flexi- 
bility, thus returning design innovation and value-added 
back to the design engineer. With the flexibility of custom 
architectures and custom microcode, system perform- 
ance is limited only by imagination. 

Improve Your Time to Market 

Because AMD's 32-bit family integrates high performance 
features such as master/slave, parity checking, 
funnel shifters, priority encoders, and mask generators, 
the Am29300/29C300 family meets the complex func- 
tional requirements of sophisticated systems and can 
eliminate the need for custom ICs. With the Am29300/ 
29C300 there are no engineering circuit turnaround 
delays, no hidden Non-Recurring-Engineering costs, no 
complex test engineering correlations, and no waiting. 
Off-the-shelf availability of a highly integrated, fully 
tested product of guaranteed quality can mean improved 
profits for the system application. 

Specifications That Count 

We provide you with the tools and data necessary to 
make your design right the first time. You can be assured 
that the specifications of the parts you order are guaran- 
teed by AMD as printed in the data sheets. Designers 
require worst case guaranteed parameter values, and 
AMD provides them. AMD removes the uncertainty of 
customized design with fully guaranteed, standard, off- 
the-shelf, 32-bit products. These state-of-the-art bipolar 
and CMOS building blocks are the ideal solution for 32- 
bit applications. 

Military Product Position 

AMD is committed to support the industry with military 
qualified and specified Am29C300 family products. The 
entire family is being offered as 883C level B fully 
compliant APL products. In addition, we plan to release 
the family in DESC military drawings. This will provide the 
user with alternatives to source control drawings, thus 
saving cost and time. 






CHAPTER 1 

Am29300/29C300 Family Overview 



Manufacturing - Processes and Planning 

AMD's Commitment to Process Technology 
Improvements 

The Am2901 industry standard bit-slice ALU is an ideal 
example of AMD's commitment to process improvements. 
Table 1-1 and Figure 1-1 demonstrate the performance 
improvements of the Am2901. Since its introduction, 
the Am2901's performance has increased 
nearly three-fold while its price has dropped by a factor of 
ten. This represents a 25 percent annual price/performance 
improvement over 12 years. The philosophy of 
performance improvements through process technologies 
applies to all members of AMD's microprogrammable 
products. 



Table 1-1 

Year   Device      Technology                           Die Size     Speed (A,B → G,P)   Power 
1975   Am2901      Low-Power Schottky                   33 K mil²    80 ns               1.5 W 
1977   Am2901A     Dual Layer Metal, Ion Implantation   20 K mil²    65 ns               1.5 W 
1978   Am2901B     Projection Printing                  15 K mil²    50 ns               1.5 W 
1981   Am2901C     ECL Internal, TTL I/O, IMOX          15 K mil²    37 ns               1.5 W 
1986   Am29C01     1.6 µm CMOS                          15 K mil²    37 ns               0.5 W 
1987   Am29C01-1   1.2 µm CMOS Speed Select             15 K mil²    28 ns               0.5 W 
1987   Am29C01-2   1.0 µm CMOS                          15 K mil²    19 ns               0.5 W 



Figure 1-1. Am2901 Performance 



Figure 1-2. Am29300/29C300 Performance 



Bipolar VLSI 

The Am29300 family contains some of the largest bipo- 
lar ICs produced anywhere in the world. For example, the 
Am29332 has over 5,000 gates, 31,000 devices, and 
measures 142,000 mils². AMD's IMOX S-2 process 
allows for such integration and high performance. Future 
advances in AMD's bipolar process will include process 
"tweaks" as well as total changes in process approach. 
These advances will provide improved performance and 
yields, directly affecting the price/performance of the 
Am29300 family. 

CMOS VLSI 

The Am29C300 family, like its bipolar counterpart, also 
contains very large die. The Am29C325 encompasses 
nearly 11,000 gates and measures almost 130,000 mils². 

AMD's CS-11 is the current CMOS workhorse process 
for the Am29C300 family. At an effective channel length of 
1.6 microns, CS-11 is capable of approaching the bipolar 
speeds on all specifications. 

There will be continued process improvements to the 
current CMOS technology. The first improvement, 
CS-11A, will be available on all Am29C300 products in 
Q4 1987. CS-11A has an effective channel length of 1.2 
microns, resulting in a 25 percent performance improvement 
over CS-11. 

Table 1-2 demonstrates the performance improvements 
expected on the Am29C300 family as these processes 
are incorporated into the family. 



Table 1-2 CMOS Evolution 

Year   Process   Effective Channel Length   Typical Gate Delay 
1986   CS-11     1.6 micron                 1.25 ns 
1987   CS-11A    1.2 micron                 0.90 ns 
1988   CS-21     1.0 micron                 0.65 ns 



The Philosophy Behind the Functionality 

When AMD introduced the 4-bit slice (memory plus ALU) 
Am2901 in 1975, semiconductor and packaging technologies 
prevented the integration of a 16- or 32-bit unit. 
The 4-bit slice with internal memory and external 
carry-look-ahead and a 48-pin package were the right compromise 
then. Today, semiconductor and packaging technologies 
have advanced to a point where a full 32-bit ALU 
with many non-sliceable features, internal carry-look-ahead, 
and systems access to all buses can be put on 
one chip, with expandable memory on another. This 
results in higher versatility and higher performance. 

There are several reasons for the choice of a wider data 
path. First, cycle time is improved significantly if carry 
lookahead is contained entirely on the chip. Second, 
certain powerful on-chip functions, such as the funnel 
shifter, priority encoder, and mask generator are ex- 
tremely difficult to "slice." Third, a higher level of integra- 
tion leads to a more cost-effective system solution. 
These and other advantages contributed to the decision 
to make the Am29332/29C332 a complete 32-bit func- 
tion rather than a slice. 

The Am29300/29C300 philosophy has also removed 
the register file from the ALU, providing the designer 
greater system flexibility and making expansion and 
regular addressing much easier. The new partitioning 
results in a number of benefits. The user gets a functionally 
more powerful processor with two uncommitted 
input buses and gains the flexibility of adding storage 
elements to those buses. The Am29300/29C300 family 
is designed to be the most functional and powerful family 
of microprogrammable building block products available 
on the market. 



1.2 Am29300/29C300 FAMILY DEVICE 
OVERVIEW 

The Am29332/29C332 32-Bit ALU - The 
Heart of a New Generation of Machines 

The Am29332/29C332 is AMD's first 32 bit wide ALU. 
Parallel processing of 32 bits of data, coupled with very 
fast cycle time, provides throughput unprecedented in 
VLSI-based systems. 

The 32-bit ALU combines maximum performance and 
integration by keeping all critical timing paths short and 
balanced. All ALU instructions have the same short cycle 
time. This includes barrel shifting, normalization, priority 
encoding and field logical operations. 










[Block diagram. Recoverable labels: Width Input, Position Input, 32-bit data inputs with Parity Checker, Source Multiplexer, 64-bit Funnel Shifter, Mask Generator, ALU & Priority Encoder, Q Register/Shifter, Up/Down Shift MUX, Status Register, Status MUX, Parity Generator, 32-bit output, STATUS, PY] 

Figure 1-3. Am29332/29C332 32-Bit ALU 



Three Ports Facilitate High Throughput 

The Am29332/29C332 has two input ports (A and B) and 
an output port (Y), all 32 bits wide. These three ports 
provide flexibility and accessibility for high-performance 
processor designs. Dedicated input and output ports 
provide a flow-through architecture and avoid the penalty 
associated with switching a bidirectional bus halfway 
through the cycle. In addition, the three-bus architecture 
allows easy parallel connection of other arithmetic units 
for even higher performance. 

Arithmetic and Logic Unit 

The 32-bit wide ALU in the Am29332/29C332 has full 
carry-lookahead to improve cycle time for all arithmetic 
operations. The ALU is a unique three-input structure 
with two data input ports and a mask input that is used on 
every cycle, thus providing very powerful instructions 
that execute in a single cycle. The mask supports 
byte-aligned arithmetic operations and field logical operations 
on variable-position, variable-length fields. The 
byte-aligned arithmetic operations use 8-, 16-, 24-, and 32-bit 
LSB-aligned operands. Field-logical instructions operate 
on operands of arbitrary length and starting position. 

Priority Encoder 

The priority encoder generates a 5-bit vector indicating 
the highest order 'one' in the 32-bit operand. These 5 bits 
are then stored in the position field of the status register 
for use during the next cycle. The priority encoder sup- 
ports all byte-aligned data types; the result is dependent 
upon the byte width specified. This function supports 
normalization necessary for floating point operations; it 
also enhances certain graphics primitives. 
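
The behavior described above can be sketched in software. The following is a minimal model, not the device's definition: it assumes the 5-bit result is the bit index of the highest-order one counted from the LSB, and that the byte width simply restricts the search to the LSB-aligned field. The function names and the handling of an all-zero operand are hypothetical; the data sheet defines the actual encoding.

```python
from typing import Optional


def priority_encode(operand: int, byte_width: int = 4) -> Optional[int]:
    """Bit index (0 = LSB) of the highest-order 1 in the LSB-aligned field.

    Assumption: positions count up from the LSB. Returns None for a zero field.
    """
    if byte_width not in (1, 2, 3, 4):
        raise ValueError("byte_width must be 1, 2, 3, or 4")
    field = operand & ((1 << (8 * byte_width)) - 1)
    if field == 0:
        return None
    return field.bit_length() - 1


def normalize_shift(operand: int, byte_width: int = 4) -> Optional[int]:
    """Left-shift count that moves the leading 1 to the top bit of the field."""
    position = priority_encode(operand, byte_width)
    if position is None:
        return None
    return 8 * byte_width - 1 - position
```

In a floating-point normalization step, the shift count returned by `normalize_shift` would be applied to the mantissa and subtracted from the exponent.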
64-Bit Funnel Shifter 

The on-board 64-bit input, 32-bit output funnel shifter is 
much more than a conventional barrel shifter. The shifter 
can extract any contiguous field of 32 bits from a 64-bit 
input. This input may consist of concatenated A and B 
input words or, for barrel shifting, duplicated A or B input 
words. 

Residing in the ALU data path, the shifter can perform an n-bit 
shift or rotate in conjunction with a logical ALU 
operation, all in the same cycle, without increasing the 
length of the cycle. This capability affords single-cycle 
execution of logical operations between unaligned fields, 
a function that would take multiple cycles in other 
architectures. 
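
As an illustration of the funnel-shift idea, here is a small software sketch. It assumes the 64-bit input is formed with one word in the upper half and the other in the lower half, and that the extraction point is given as a bit offset from the LSB; both conventions, and the function names, are assumptions for this example rather than the device's documented controls.

```python
MASK32 = 0xFFFFFFFF


def funnel_shift(high: int, low: int, position: int) -> int:
    """Extract 32 contiguous bits from the 64-bit value high:low.

    `position` (0..32) is the offset of the extracted field's LSB.
    """
    if not 0 <= position <= 32:
        raise ValueError("position must be in 0..32")
    combined = ((high & MASK32) << 32) | (low & MASK32)
    return (combined >> position) & MASK32


def rotate_right(word: int, count: int) -> int:
    """Barrel rotate: duplicate the word on both halves, then extract."""
    return funnel_shift(word, word, count % 32)
```

Duplicating one operand on both halves turns the extraction into a rotate, which is the barrel-shifting case the text mentions.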

Mask Generator 

The power and flexibility of the processor stems partly 
from its ability to generate a mask to control the width of 
an operation for each instruction without any cycle time 
penalty. The mask generator at the ALU input creates a 
contiguous field of ones and contains its own shifter to 
position this control field anywhere in the data path. The 
mask generator can also be used as a pattern generator, 
by passing the mask through the ALU. 
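
A mask of this kind is easy to model. The sketch below assumes the mask is specified by a width and an LSB position, that bits shifted past bit 31 are dropped, and that a field-logical operation leaves the bits outside the field equal to the second operand. All three choices, and the function names, are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF


def generate_mask(width: int, position: int) -> int:
    """Contiguous field of `width` ones whose LSB sits at bit `position`."""
    if not 0 <= width <= 32:
        raise ValueError("width must be in 0..32")
    if not 0 <= position <= 31:
        raise ValueError("position must be in 0..31")
    ones = (1 << width) - 1
    # Assumption: ones shifted beyond bit 31 are discarded, not wrapped.
    return (ones << position) & MASK32


def field_and(a: int, b: int, width: int, position: int) -> int:
    """AND inside the masked field; outside it, pass operand b through."""
    mask = generate_mask(width, position)
    return ((a & b) & mask) | (b & ~mask & MASK32)
```

Passing the mask itself through unchanged, as in `generate_mask(8, 4)`, corresponds to the pattern-generator use mentioned above.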

Status Register 

The processor has a 32-bit wide status register that 
contains: information on position and width of the operand; 
the ALU status flags Carry, Negative, Overflow, and 
Zero; status bits for evaluation of inequalities; a link bit for 
multiprecision shifts; an M flag for high speed multiply 
and divide; and intermediate nibble carries for BCD 
arithmetic. An extract-status instruction is provided that 
allows any bit from the status register to be output at the 
Y-port. This is particularly useful in machines employing 
stack architectures. Instructions to save and restore the 
status register are also provided. 

Multiply and Divide Support 

The chip incorporates dedicated hardware to allow efficient 
implementation of multiply and divide algorithms for 



both unsigned and signed arithmetic data types. The 
modified Booth multiply algorithm processes two bits 
per cycle. The four-quadrant, non-restoring divide algo- 
rithm processes one bit per cycle. Since the data path 
width is fixed at 32 bits, the instructions can be simplified 
to provide "first step," "iterate step" and "last step" com- 
mands for both multiply and divide. Programming slices 
is no longer necessary since all multiply and divide steps 
are provided in the instruction set. For business-oriented 
machines, the ALU is capable of performing BCD arith- 
metic on packed BCD numbers. In order to keep non- 
BCD operations fast, BCD arithmetic is executed by 
binary arithmetic followed by BCD correction. 
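
To make "two bits per cycle" concrete, the following sketch implements a generic radix-4 (modified Booth) signed multiply in software. It is a textbook formulation, not a model of the device's first/iterate/last step instructions, and the function name and argument conventions are hypothetical. Each loop iteration consumes two multiplier bits, so a 32-bit multiply takes 16 iterations.

```python
def booth_radix4_multiply(multiplicand: int, multiplier: int, bits: int = 32) -> int:
    """Signed multiply of two `bits`-wide two's-complement values.

    Each step recodes the bit triple (b[2i+1], b[2i], b[2i-1]) into a digit
    in {-2, -1, 0, 1, 2} and adds digit * multiplicand * 4**i.
    """
    if bits % 2:
        raise ValueError("bits must be even")
    word_mask = (1 << bits) - 1

    # Interpret the multiplicand as a signed value.
    m = multiplicand & word_mask
    if m >> (bits - 1):
        m -= 1 << bits

    y = multiplier & word_mask
    product = 0
    previous_bit = 0  # implicit zero to the right of bit 0
    for step in range(bits // 2):
        b0 = (y >> (2 * step)) & 1
        b1 = (y >> (2 * step + 1)) & 1
        digit = b0 + previous_bit - 2 * b1
        product += (digit * m) << (2 * step)
        previous_bit = b1
    return product
```

The result is returned as a Python integer holding the full signed double-width product; handling of unsigned operands is not covered by this sketch.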

The Instruction Set: Powerful and Flexible 
Yet Simple and Regular 

The Am29332/29C332 instruction set complements the 
powerful hardware. To ease the task of code generation, 
the instruction set is symmetrical and regular. There are 
two large classes of instructions. The first class handles 
byte-aligned data (8-, 16-, 24-, or 32-bit LSB-aligned). It 
is comprised of: data movement instructions; arithmetic 
instructions, including multiply and divide steps and BCD 
instructions; logical instructions; and single-bit shift and 
prioritize operations. The second class of instructions 
operates on variable-length, variable-position fields. It 
includes N-bit shift and rotate, field extract, and field 
logical operations. 

The Am29331/29C331 - 16-Bit 
Micro-Interruptible Sequencer 

The Am29331/29C331 is a high speed sequencer con- 
trolling the sequence of microinstructions stored in mi- 
croprogram memory. The instruction set aids structured 
microprogramming and handles sequential execution, 
branches, subroutines and loops. The sequencer in- 
structions may be unconditional or conditional based on 
CPU status, an on-board 8-input test multiplexer, and a 
polarity control. The sequencer has a 16-bit wide address 
path and can thus access 64K words of microcode 
memory. It is transparently interruptible at any microinstruction 
boundary. 



Figure 1-4. Am29331/29C331 16-Bit Microinterruptible Sequencer 



Balanced Timing Means Greater Throughput 

In previous generation microprogrammed systems, the 
control path containing the sequencer has often been the 
bottleneck, because the sequencers were slower than 
the associated data paths. Not so in the Am29300/ 
29C300 family. The speed of the Am29331/29C331 
sequencer has been designed such that the entire sys- 
tem timing is balanced between the control path and data 
path, leading to higher overall throughput. 

Micro-Level Interruptible 

Real time interrupt handling at the microinstruction level 
is made possible by the interrupt return address register 
and the bidirectional Y-port. While the interrupt address 
enters the part through the Y-port, the interrupt return 
address is saved on the stack. Nested interrupts are 
handled the same way. 



Built-in Trap Handling 

As an architectural alternative to the interrupt-driven 
approach, the Am29331/29C331 Sequencer also has 
provision for handling "traps" transparently at the micro- 
instruction level, upon the occurrence of specified sys- 
tem events. In this mode, the current microinstruction is 
aborted. The specified trap routine is executed (like an 
interrupt). Following the trap routine, however, the aborted 
microinstruction is re-executed (instead of proceeding on 
to the next microinstruction, as in an interrupt). 

33-Level Stack 

The 33-level stack provides sufficient depth to handle 
nested loops and subroutines; it is also used to save the 
status of the sequencer when handling interrupts. Since 
the stack is externally accessible, its contents may be 
unloaded through the bidirectional D-port for diagnostic, 
debugging or fault recovery purposes. The stack may 
also be loaded from the outside through the D-port. This 
may be used for context switching, for example. 

Multitasking Support 

By providing a HOLD control pin, the designer may use 
multiple sequencers in a multitasking system, with only 
one sequencer active at any one time. The output Y-ports 
of the sequencers are tied together to address the same 
microcode memory. This is useful, for example, for rapid 
context switching at the microinstruction level. 

Address Comparator Eases Debugging 

The sequencer compares the address on the Y-port with 
the contents of an internal break-point register. Break- 
point detection is useful for debugging the system or 
gathering run-time statistics. 

Two Branch Address Inputs 

Two separate branch address inputs, D and A, are 
provided to speed up source address selection. Both A 
and D ports can be used to load the counter. The D port 
can also be used to load or unload the stack while the A 
port may be used to input a branch or map address, 
eliminating the need to three-state selected sources. 

Built-in Test Generation Logic 

In the Am29331/29C331, unlike previous sequencers, 
test generation logic and one layer of condition test 
multiplexer logic are built-in. This not only reduces 
component count, but also improves cycle time by mini- 
mizing inter-chip delays and by moving the multiplexer 
into fast internal ECL gates. 

Multiway 

Four sets of four-bit multiway inputs are provided. Each 
such set of 4 bits can replace the four least significant bits 
of D input, allowing a direct branch to any of 16 consecutive 
locations in the microprogram memory. The multiway 
capability allows checking of up to four simultaneous 
test conditions in a single cycle. This is obviously an 
attractive alternative to checking each test condition 
serially, a much slower multicycle process. 
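
The address substitution described above amounts to a simple bit splice. The sketch below assumes a 16-bit D input whose four least significant bits are replaced by the selected multiway set; the function name and the way the set is passed in are illustrative assumptions.

```python
def multiway_branch_address(d_input: int, multiway_bits: int) -> int:
    """Replace the four LSBs of the 16-bit D input with a 4-bit multiway set."""
    return (d_input & 0xFFF0) | (multiway_bits & 0xF)


def dispatch_targets(d_input: int) -> list:
    """All 16 consecutive microprogram locations reachable from one D value."""
    return [multiway_branch_address(d_input, code) for code in range(16)]
```

With four test conditions wired to one multiway set, each of the 16 possible outcomes selects its own entry in the 16-word block starting at `d_input & 0xFFF0`, so the four-way test resolves in one branch.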

The Most Versatile Sequencer Ever 

The combination of 16 bits of address, real time interrupt 
capability, two address ports, a deep stack and other 
capabilities makes this device the most feature-loaded 
sequencer ever offered. 

The Am29334/29C334 Register File 

The Am29334/29C334 is a 64 word by 18 bit, dual- 
access, four-port register file. It is deliberately separate 
from the ALU to allow easy, regular expansion, both 
horizontally for wide data paths and vertically for large 
register file machines. 

Four-Port Architecture 

Two Read and two Write data ports allow independent 
and simultaneous access to two register file locations. 
The Read and Write ports are separated to eliminate the 
delay caused by turn-around of bidirectional buses. The 
dual-address, four-port architecture allows any combina- 
tion of two reads, writes, or read-writes - no restrictions. 

Organization Supports Parity 

Since the Am29334/29C334 has a by-18 organization, it 
can store two bytes with parity in each of its 64 words. As 
a data path storage element, the register file neither 
generates nor checks parity. When used in conjunction 
with the Am29332/29C332 processor (which provides 
parity checking on its inputs and parity generation on its 
output), it provides a bus compatible register file, thus 
extending parity protection to the entire data path loop. 
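
The by-18 organization can be pictured as two 9-bit groups, each holding a data byte and its parity bit. The sketch below is only an illustration: the bit layout (parity stored as the ninth bit above each byte) and the use of even parity are assumptions not stated in this section, and the function names are hypothetical.

```python
def even_parity_bit(byte: int) -> int:
    """Parity bit that makes the total number of ones (byte + bit) even."""
    return bin(byte & 0xFF).count("1") & 1


def pack_word(high_byte: int, low_byte: int) -> int:
    """Pack two bytes and their parity bits into one 18-bit word.

    Assumed layout: bits 0-7 low byte, bit 8 its parity,
    bits 9-16 high byte, bit 17 its parity.
    """
    low = (even_parity_bit(low_byte) << 8) | (low_byte & 0xFF)
    high = (even_parity_bit(high_byte) << 8) | (high_byte & 0xFF)
    return (high << 9) | low


def check_word(word: int) -> bool:
    """True if both 9-bit groups contain an even number of ones."""
    for shift in (0, 9):
        group = (word >> shift) & 0x1FF
        if bin(group).count("1") & 1:
            return False
    return True
```

In this picture the register file only stores the 18 bits; generating and checking the parity bits is left to the device on either side of it, as the text describes for the Am29332/29C332.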

Array Processing Products/Arithmetic 
Accelerators 

The Am29300/29C300 family is capable of very fast 
operation on 32-bit fixed-point numbers. When greater 
dynamic range is necessary, floating-point numbers 
are often chosen. Advanced Micro Devices offers 
high-speed VLSI integrated circuits designed to support the 
growing need for high-performance array and signal 
processing. Applications include graphics, image 
processing, communications, medical instrumentation, 
radar and other electronic warfare applications. Three 
AMD devices address these needs: Am29325/29C325 
32-bit Floating-Point Processor, Am29C323 32x32-bit 
Multiprecision Multiplier, and Am29C327 64-bit 
Floating-Point Processor. These devices achieve very high 
speeds through a combination of innovative architecture 
and AMD's advanced bipolar IMOX process and 
CMOS process. 



Am29325/29C325 



The Am29325/29C325 is a high-speed, single precision 
floating-point processor. It performs 32-bit floating-point 
addition, subtraction and multiplication operations in a 
single device, using either IEEE-P754, draft 10.0 or 
DEC VAX format. 

Single-Cycle Execution 

Since performance is the objective, all 
instructions, including multiply, require only one cycle to 
execute. 

No Mandatory Pipelining 

Although the Am29325/29C325 FPP has input and out- 
put registers to make it a general purpose accelerator, 
there are no pipeline registers internal to the floating point 
array. Even the I/O registers can be made transparent. 

Three-Bus Architecture 

The Am29325/29C325, like the Am29332/29C332, has 
a three-bus architecture, with two input buses and one 
output bus, thereby providing a bus compatible accelera- 
tor. This configuration provides high I/O bandwidth allow- 
ing the user to take full advantage of the single cycle, 
high-speed, floating-point ALU. Naturally, the input and 
output registers may be made transparent with individual 
clock enables. In addition, the input and output registers 
may be made transparent with independent feed- 



through controls. The rules remain consistent - the 
system architecture achieves the highest performance 
when the component architectures do not interfere. 

Powerful Instruction Set 

The Am29325/29C325 executes the following instruc- 
tions: 

• Add (R plus S) 

• Subtract (R minus S) 

• Multiply (R times S) 

• Constant Subtract (2 minus S) 

• Integer to Floating Point Conversion 

• Floating Point to Integer Conversion 

• IEEE to DEC Format Conversion 

• DEC to IEEE Format Conversion 

The instruction (2 minus S) is provided to support the 
Newton-Raphson division algorithm. 
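
For readers unfamiliar with the algorithm, Newton-Raphson division computes the reciprocal of the divisor S by iterating x ← x · (2 − S · x), then multiplies by the dividend. Each iteration needs only a multiply, a (2 minus S) step, and another multiply. The sketch below shows the arithmetic in software; the function names, the fixed iteration count, and the caller-supplied seed are assumptions for illustration. It runs in Python's double precision rather than the device's 32-bit format.

```python
def reciprocal(s: float, seed: float, iterations: int = 5) -> float:
    """Approximate 1/s by Newton-Raphson iteration.

    Converges when the seed x0 satisfies 0 < s * x0 < 2; the error
    (1 - s * x) is squared on every iteration.
    """
    x = seed
    for _ in range(iterations):
        t = s * x        # Multiply (R times S)
        t = 2.0 - t      # Constant Subtract (2 minus S)
        x = x * t        # Multiply (R times S)
    return x


def divide(r: float, s: float, seed: float) -> float:
    """Compute r / s as r times the approximated reciprocal of s."""
    return r * reciprocal(s, seed)
```

For example, `divide(1.0, 3.0, seed=0.3)` starts with an error of 0.1 in the reciprocal and reaches machine precision well within the five iterations used here.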

Internal Data Paths Support Accumulation 

The Am29325/29C325 has two internal feedback paths 
to facilitate two-cycle internal multiply-accumulate 
operation. The F1 bus can store the results of the multiply 
operation in an input register for subsequent accumulation. 
The F2 bus lets the output register function as an 
accumulator by making its output available as an operand 
for the next cycle. 



Am29C325 Stand-Alone Performance 

The Am29C325 is a stand-alone CMOS Floating Point 
Processor. When used with a simple sequencer such as 
the Am29C10A, it can be used as a low cost floating-point 
engine for applications requiring iterative algorithms 
such as Chebyshev and Newton-Raphson. These algo- 
rithms are used extensively in guidance, image and 
signal processing, and other DSP applications. 

Programmable I/O Structure 

To provide compatibility with different system buses, 
controls are provided for the following options: 

• Two 32-bit input buses and one 32-bit output bus 

• One 32-bit input bus and one 32-bit output bus 

• Two 16-bit input buses and one 16-bit output bus 

The input modes affect only the manner in which 
operands are entered into the device. The operation 



of the floating-point ALU is not altered. For example, 
in the 32-bit/one input-bus mode, the two 32-bit inputs 
are tied together and the two input operands are 
clocked into the input registers on alternate rising and 
falling edges of the clock. In the 16-bit, 3-bus mode, the 
32-bit operands are delivered on two consecutive clock 
cycles in 16-bit increments. 

Am29C327 Double-Precision 
Floating-Point Processor 

The Am29C327 double-precision floating-point processor 
is a high performance, single VLSI device that implements 
an extensive floating-point and integer instruction 
set. It can perform operations on single-, double-, or 
mixed-precision operands. The three most popular 
floating-point formats (IEEE, DEC, and IBM) are supported. 
IEEE operations comply with the standard P754, with 
direct implementation of special features such as gradual 
underflow and trap handling. 



Flow-Through or Pipelined 

Operations can be performed in either of two modes: 
flow-through or pipelined. In the flow-through mode, the 
ALU is completely combinatorial; this mode is best suited 
for scalar operations. Pipelined mode divides the ALU 
into one or two pipelined stages for use in vector operations, 
as is often found in graphics or signal processing. 

Three-Bus Architecture 

The Am29C327 has two input buses and one output bus 
- a three-bus architecture just like the Am29C325 float- 
ing-point processor. It provides flexibility and ease of 
interface, making it a very high performance accelerator. 

Input/Output Modes 

The Am29C327 supports eight I/O modes which provide 
a flexible interface to a variety of 32-bit and 64-bit 
systems. The input buses can be configured as separate 
32-bit input buses or as a single 64-bit input bus. It is 
possible to load two 64-bit operands in a single clock 
cycle. The input modes are: 

32-bit, double-cycle, LSWs first 

32-bit, double-cycle, MSWs first 

32-bit, single-cycle, LSWs first 

32-bit, single-cycle, MSWs first 

64-bit, double-cycle, R first 

64-bit, double-cycle, S first 

64-bit, single-cycle, R first 

64-bit, single-cycle, S first 

Integer or Floating-Point 

In addition to supporting 32-bit and 64-bit integer operations, 
the Am29C327 supports the following floating-point 
formats in single- or double-precision: 

IEEE P754 version 10.1 

DEC F, DEC D, and DEC G formats 

IBM System/370 format. 

Conversion between the floating-point formats and con- 
version between floating-point and integer formats are 
also provided. This is a very powerful feature not avail- 
able in any other architecture. 

Mixed-Precision Operations 

All Am29C327 instructions, floating-point or integer,
can be performed on either single- or double-precision
operands. In addition, the user can elect to mix precisions
within an operation. All operations are internally performed
in double precision; the user specifies the desired
precision of the input and output operands. The
necessary precision conversions are made in concert
with the selected operation, with no additional cycle-time
overhead.

Register File and Internal Datapath Support 
Compound Operations 

The ALU of the Am29C327 has three data input ports and
can perform operations of the form (A*B)+C. An eight-deep
register file for storing intermediate results used in
recursive operations, and the on-chip 64-bit datapath,
facilitate compound operations such as Newton-Raphson
division, sum-of-products, and transcendentals.

Comprehensive Floating-Point and Integer 
Instruction Sets 

The Am29C327 implements an extensive number of
arithmetic and logical instructions. These instructions fall
into the following categories:

addition/subtraction 

multiplication 

multiplication/accumulation

comparison 

max/min 

saturation (clipping) 

rounding to integral value 

absolute value, negation 

reciprocal seed generation 

floating-point <-> floating-point conversion

floating-point <-> integer conversion

integer <-> integer conversion

pass operand

logical operations, e.g., AND, OR, XOR, NOT

move data 

By concatenating these operations, the user can also 
perform division, square-root extraction, polynomial 
evaluation, and other functions not implemented directly.

Am29C323 Multiplier 

The Am29C323 is a high-speed parallel 32x32-bit multiplier
designed to speed up systems using fixed- or floating-point
notation.

Three-Bus Architecture 

Just like other members of the family, the Am29C323 has 
two input buses and one output bus. This configuration 
provides high I/O bandwidth, allowing the user to take full 
advantage of the high-speed parallel multiplier core of 
the device. 

Figure 1-10. Am29C323 32x32 Parallel Multiplier



Multiprecision Multiplication Made Easy 

By including 32-bit shift and accumulate to generate
partial products, the internal architecture of the
Am29C323 supports fast multiprecision multiplication.
Both input ports have dual 32-bit registers, and the output
port can select from a 67-bit product register, a 32-bit
temporary register, or directly from the 32x32-bit multiplier
array. A complete 32x32-bit clocked multiplication
takes a single cycle (naturally, and with no pipelining!).
Multiprecision multiplication uses the shift and accumulate
logic to collect partial products starting with the least
significant product. The number of cycles depends upon
the input data width, with three-cycle latency, as shown
in the table below. By using the I/O registers for pipelining,
much greater throughput can be achieved. For
example, by overlapping 64x64-bit operations, a full 128-bit
product is available every four cycles. Multiplying the
mantissas of two double-precision 64-bit floating-point
numbers, for example, is one possible application of this
high-speed multiprecision multiplication capacity.



                 Number of Cycles

Operands     Single Product     Overlapped Operations
32x32        1                  1
64x64        7                  4
96x96        12                 9
128x128      19                 16
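
The shift-and-accumulate scheme described above can be sketched in Python as follows. This is an arithmetic model only, not the Am29C323 microcode sequence: operands are split into 32-bit words, partial products are collected starting with the least significant one, and the accumulator is shifted right by 32 bits after each result word. The function names and the word-list representation are assumptions made for this example.

```python
MASK32 = (1 << 32) - 1


def split_words(value, n_words):
    """Split an unsigned integer into n_words 32-bit words, least significant first."""
    return [(value >> (32 * i)) & MASK32 for i in range(n_words)]


def join_words(words):
    """Inverse of split_words."""
    return sum(w << (32 * i) for i, w in enumerate(words))


def multiprecision_multiply(r_words, s_words):
    """Multiply two equal-length word lists using only 32x32 products.

    For each output word k, all partial products r[i]*s[k-i] are
    accumulated, the low 32 bits are emitted, and the accumulator is
    shifted right by 32 bits for the next word.
    """
    n = len(r_words)
    assert len(s_words) == n
    acc = 0
    out = []
    for k in range(2 * n - 1):
        for i in range(n):
            j = k - i
            if 0 <= j < n:
                acc += r_words[i] * s_words[j]  # one 32x32 multiplication
        out.append(acc & MASK32)
        acc >>= 32
    out.append(acc & MASK32)  # most significant word
    return out
```

An n-word by n-word product needs n squared 32x32 multiplications in this scheme (4, 9, and 16 for 64-, 96-, and 128-bit operands), which coincides with the Overlapped Operations column of the table.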

Registered Buses

All buses in the device are registered, and each register
has its own Clock Enable. The device operates from a
single clock, ideal for microprogrammed systems. All
ports (input, output, and instruction) can be made
transparent independently.

Complete Interlocking Fault Detection 

To enhance system reliability by ensuring data integrity
and correct hardware operation, the family supports both
master/slave fault detection and data path parity. The
system features byte parity checking on the inputs and
byte parity generation on the outputs of the Am29332/
29C332 ALU and Am29C323 32x32-bit multiplier. Also,
the organization of the Am29334/29C334 64x18 register
file accommodates parity bits for each byte. The parity
mechanism assures data path integrity. The major functional
blocks (Am29332/29C332 ALU, Am29331/29C331
sequencer, Am29C323 32x32-bit multiplier, and
Am29C327 64-bit floating-point processor) have "master/slave
fault detection" to ensure correct operation
without having to carry parity through complex internal
logic (shifters, mask generators, etc.) and without having
to pay the resulting delay penalties. In master/slave
mode, two functional units are connected in parallel
with one unit doing the actual operation and the other
checking the result, on a cycle-by-cycle, bit-by-bit basis.
The master is used for the normal data path. In the slave,
however, all outputs become inputs, and the slave compares
the outputs of the master with its own internally
generated result. If the two don't match, an error signal
is generated, triggering an interrupt at the microinstruction
level. No specialized software is required for
the master/slave scheme. Also, the designer can choose
to impose redundancy at the component or board level.
The parity mechanism and the master/slave concept,
which use cost-effective hardware rather than expensive
software, provide a comprehensive solution for
fault-tolerant systems.
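
The master/slave comparison can be pictured with a small Python model. It is only a behavioral sketch: `unit` stands for any functional block, and the `fault` argument is a hypothetical way of injecting a bit flip on the master's outputs in one cycle so that the slave's check has something to catch.

```python
def run_master_slave(unit, operand_stream, fault=None):
    """Return one error flag per cycle for a master/slave pair.

    unit           -- function modelling the functional block
    operand_stream -- iterable of operand tuples, one per cycle
    fault          -- optional (cycle, bit_mask) pair; the mask is XORed
                      onto the master's output in that cycle
                      (hypothetical fault-injection hook)
    """
    errors = []
    for cycle, operands in enumerate(operand_stream):
        master_out = unit(*operands)
        if fault is not None and cycle == fault[0]:
            master_out ^= fault[1]
        # The slave computes the same result internally and compares it,
        # bit by bit, with what it sees on the master's outputs.
        slave_out = unit(*operands)
        errors.append(master_out != slave_out)
    return errors


def add32(a, b):
    """Example functional block: 32-bit addition."""
    return (a + b) & 0xFFFFFFFF
```

For example, `run_master_slave(add32, [(1, 2), (3, 4), (5, 6)], fault=(1, 0x10))` flags an error only in the second cycle.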

Am29337 16-Bit Bounds Checker

The need for simple yet sophisticated functionality and
board space savings created the Am29337, a 16-bit
bounds checker. This product provides inexpensive,
easy-to-use solutions for the following applications:

• intelligent address decoder 

• window clipping in graphics 

• filter in DSP 

• memory protection systems 

• RISC processors 

• multi/parallel processors 

• logic analyzers 

• tag/data buffers 

The Am29337 compares incoming 16-bit data against
both lower and upper bounds and reports whether the
data is inside or outside the bounds. It can be cascaded
for 32-bit data and longer without sacrificing speed.
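
A bounds check of this kind can be sketched in Python as follows. The sketch assumes inclusive bounds and two's-complement signed values, neither of which is stated above, and it models wider words simply by changing `width` rather than by modelling the cascade connections between devices. The function names are hypothetical.

```python
def to_signed(value, width):
    """Interpret a width-bit pattern as a two's-complement number."""
    value &= (1 << width) - 1
    if value >> (width - 1):
        return value - (1 << width)
    return value


def in_bounds(value, lower, upper, signed=False, width=16):
    """Return True if value lies within [lower, upper].

    Assumption: both bounds are inclusive. With signed=True all three
    bit patterns are interpreted as two's-complement numbers.
    """
    mask = (1 << width) - 1
    value, lower, upper = value & mask, lower & mask, upper & mask
    if signed:
        value = to_signed(value, width)
        lower = to_signed(lower, width)
        upper = to_signed(upper, width)
    return lower <= value <= upper
```

The same bit patterns can give different answers in the two modes: 0x8000 lies between 0x7000 and 0x9000 when unsigned, but not when the patterns are read as signed numbers.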

The Am29337 is housed in a 400 mil ceramic 28-pin DIP
for board space savings.

User Benefits 

• Replaces MSI devices, saves board space 

• Low-cost solution compared to conventional alter- 
natives 

Distinctive Features 

• Double comparators compare a 16-bit input number
against a lower and an upper limit

• 16-bit operation, cascadable to longer words

• Compares signed or unsigned numbers 

Figure 1-13. Am29337 Block Diagram

Am29338 32-Bit Byte Queue

The Am29338 is a general purpose 32-bit intelligent
FIFO that allows up to four bytes to be queued or dequeued
in a single cycle.

Fabricated with AMD's IMOX-S2 technology and housed
in a 120-pin PGA, the Am29338 meets the requirements
for a high-speed FIFO buffer with minimum real estate.
The part will also be made available in high-speed,
low-power 1.2 micron CMOS technology.

Features of the Am29338 include: 

• Queuing of up to 128 bytes

• Queuing or de-queuing of up to 4 bytes at a time 

• Byte rotation on the inputs and outputs 

• Asynchronous/synchronous operations 

• Accepts 8-, 16-, 24-, and 32-bit input data 

• Repetitive queuing of block data 

• Almost empty/full signal if less than 4 bytes available 
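
The queueing behavior in the feature list can be modelled with a short Python class. This is a behavioral sketch under stated assumptions, not a description of the device interface: byte rotation and the asynchronous mode are left out, and the almost-empty/almost-full flags are read as "fewer than four bytes stored" and "fewer than four free locations", which is one interpretation of the last bullet. The class and method names are hypothetical.

```python
from collections import deque


class ByteQueue:
    """Byte-wide FIFO that moves up to four bytes per operation."""

    CAPACITY = 128        # bytes
    MAX_PER_CYCLE = 4     # bytes queued or de-queued at a time

    def __init__(self):
        self._bytes = deque()

    def __len__(self):
        return len(self._bytes)

    def enqueue(self, data):
        """Queue 1 to 4 bytes in one operation."""
        if not 1 <= len(data) <= self.MAX_PER_CYCLE:
            raise ValueError("1 to 4 bytes per operation")
        if len(self._bytes) + len(data) > self.CAPACITY:
            raise OverflowError("queue full")
        self._bytes.extend(data)

    def dequeue(self, count):
        """De-queue 1 to 4 bytes in one operation, oldest first."""
        if not 1 <= count <= self.MAX_PER_CYCLE:
            raise ValueError("1 to 4 bytes per operation")
        if count > len(self._bytes):
            raise IndexError("not enough bytes queued")
        return bytes(self._bytes.popleft() for _ in range(count))

    @property
    def almost_empty(self):
        # Assumed meaning: fewer than four bytes available to read.
        return len(self._bytes) < self.MAX_PER_CYCLE

    @property
    def almost_full(self):
        # Assumed meaning: fewer than four free byte locations.
        return self.CAPACITY - len(self._bytes) < self.MAX_PER_CYCLE
```

Because the input and output widths are independent, three bytes written in one operation and two in the next can be read back as a single four-byte word followed by one remaining byte.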



Significant User Benefits 

The Am29338 is an excellent choice for a wide variety of
system design problems. Its benefits include: a shorter 
design cycle when compared with implementing the 
same functions with traditional FIFOs, higher perform- 
ance, off-the-shelf functionality, less board space, and 
less power than the separate parts needed to combine 
this logic. 

Applications 

• Hardware mailbox between two heterogeneous 
processors 

• I/O bus buffers between a processor and 
controller 

• Instruction prefetch queue for byte addressable 
microprocessor systems 

• Write buffer between CPU and main memory 

• Bus conversions, 8-, 16-, 24-, and 32-bits. 

1.3 A.C. AND D.C. PARAMETER DEFINITIONS



Definition of A.C. Switching Terms 

The highest operating clock frequency.

The propagation delay time from an input change to an output LOW-to-HIGH transition. 

The propagation delay time from an input change to an output HIGH-to-LOW transition. 

Pulse width. The time between the leading and trailing edges of a pulse. 

Rise time. The time required for a signal to change from 10% to 90% of its measured values. 

Fall time. The time required for a signal to change from 90% to 10% of its measured values.

Set-up time. The time interval for which a signal must be applied and maintained at one input terminal 
before an active transition occurs at another terminal. 

Hold time. The time interval for which a signal must be retained at one input after an active transition occurs
at another input terminal.

HIGH to disable. The delay time from a control input change to the output transition from the HIGH-level 
to high-impedance (measured at 0.5V change). 

LOW to disable. The delay time from a control input change to the output transition from the LOW-level
to high-impedance (measured at 0.5 V change).

Enable HIGH. The delay time from a control input change to the output transition from high-impedance 
to HIGH-level. 

Enable LOW. The delay time from a control input change to the output transition from high-impedance 
to LOW-level. 

Definition of D.C. Terms 

CPD  Power dissipation capacitance used to determine the no-load dynamic current consumption.

HIGH, applying to a HIGH voltage level. 
LOW, applying to a LOW voltage level. 
Input 
Output 
Current flowing out of the device. 

Current flowing into the device. 

LOW-level input current with a specified LOW-level voltage applied. 
HIGH-level input current with a specified HIGH-level voltage applied. 
LOW-level output current. 
HIGH-level output current. 
Output short-circuit source current. 

Supply current drawn by the device from the VCC power supply.
Three-state off-state output current, HIGH-level voltage applied.
Three-state off-state output current, LOW-level voltage applied.

VCC  The range of supply voltage over which the device is guaranteed to operate.

VIL  The highest input voltage that is guaranteed to be recognized by the device as a logic LOW.

VIH  The lowest input voltage that is guaranteed to be recognized by the device as a logic HIGH.

VOL  The highest logic LOW voltage guaranteed at the output terminal while sinking the specified load current IOL.

VOH  The lowest logic HIGH voltage guaranteed at the output terminal when sourcing the specified source current IOH.

IEE  The supply current drawn by the device from the VEE power supply for an ECL circuit.

VEE  Most negative power supply for an ECL circuit.

CMOS Family

Am29C331 CMOS 16-Bit Microprogram Sequencer
Am29C332 CMOS 32-Bit Arithmetic Logic Unit
Am29C334 CMOS Four-Port Dual-Access Register File
Am29C325 CMOS 32-Bit Floating-Point Processor*
Am29C327 CMOS Double-Precision Floating-Point Processor*

* Front page only of data sheet. See Chapter 4 for complete data sheet.



Am29C331 

CMOS 16-Bit Microprogram Sequencer 






PRELIMINARY 



DISTINCTIVE CHARACTERISTICS 



• 16 Bits Address up to 64K Words
  Supports 110-ns microcycle time for a 32-bit high-performance
  system when used with the other members of the Am29C300 Family.

• Speed Select
  Supports 80-ns system cycle time.

• Real-Time Interrupt Support
  Micro-traps and interrupts are handled transparently
  at any microinstruction boundary.

• Built-in Conditional Test Logic
  Has twelve external test inputs, four of which are
  used to internally generate an additional four test
  conditions. Test multiplexer selects one out of 16
  test inputs.

• Break-Point Logic
  Built-in address comparator allows break-points in
  the microcode for debugging and statistics collection.

• Master/Slave Error Checking
  Two sequencers can operate in parallel as a master
  and a slave. The slave generates a fault flag for
  unequal results.

• 33-Level Stack
  Provides support for interrupts, loops, and subroutine
  nesting. It can be accessed through the D-bus
  to support diagnostics.

GENERAL DESCRIPTION



The Am29C331 is a 16-bit wide, high-speed single-chip 
sequencer designed to control the execution sequence of 
microinstructions stored in the microprogram memory. The 
instruction set is designed to resemble high-level language 
constructs, thereby bringing high-level language program- 
ming to the micro level. 

The Am29C331 is interruptible at any microinstruction
boundary to support real-time interrupts. Interrupts are
handled transparently to the microprogrammer as an unexpected
procedure call. Traps are also handled transparently
at any microinstruction boundary. This feature allows
re-execution of the prior microinstruction. Two separate buses
are provided to bring a branch address directly into the chip
from two sources to avoid slow turn-on and turn-off times
for different sources connected to the data-input bus. Four
sets of multiway inputs are also provided to avoid slow turn-on
and turn-off times for different branch-address sources.
This feature allows implementation of table look-up or use
of external conditions as part of a branch address. The
33-deep stack provides the ability to support interrupts,
loops, and subroutine nesting. The stack can be read
through the D-bus to support diagnostics or to implement
multitasking at the micro-architecture level. The master/slave
mode provides a complete function check capability
for the device.

Fabricated using Advanced Micro Devices' 1.6 micron
CMOS process, the Am29C331 is powered by a single 5-volt
supply. The device is housed in a 120-terminal pin-grid
array package.

RELATED AMD PRODUCTS

Part No.     Description
Am29114      Vectored Priority Interrupt Controller
Am29116      High-Performance Bipolar 16-Bit Microprocessor
Am29C116     High-Performance CMOS 16-Bit Microprocessor
Am29PL141    Field-Programmable Controller
Am29C323     CMOS 32-Bit Parallel Multiplier
Am29325      32-Bit Floating-Point Processor
Am29C325     CMOS 32-Bit Floating-Point Processor
Am29332      32-Bit Extended Function ALU
Am29C332     CMOS 32-Bit Extended Function ALU
Am29334      64x18 Four-Port, Dual-Access Register File
Am29C334     CMOS 64x18 Four-Port Dual-Access Register File
Am29337      16-Bit Bounds Checker
Am29338      Byte Queue




Figure 1. Am29C331 Detailed Block Diagram
(Blocks shown: multiway mux, test logic, test mux, counter mux, counter, 33 x 16 stack, stack pointer (SP), D-bus mux, stack mux, address mux, instruction decode, interrupt return address register, interrupt mux, address register and incrementer, and address comparator with EQUAL output.)

CONNECTION DIAGRAM
120-Lead PGA*

(Pin-grid layout diagram; the pin assignments are listed under PIN DESIGNATIONS.)

*Pins facing up.

PIN DESIGNATIONS
(Sorted by Pin No.)

PIN NO.   PIN NAME   PAD NO.
A-1       M0,0       1
A-2       D0         120
A-3       VCC        59
A-4       A1         58
A-5       GND        56
A-6       A3         114
A-7       Y3         54
A-8       D5         51
A-9       GND        50
A-10      D6         49
A-11      VCC        47
A-12      A7         106
A-13      Y7         46
B-1       M1,0       61
B-2       A0         60
B-3       Y0         119
B-4       Y1         117
B-5       A2         116
B-6       D3         55
B-7       D4         112
B-8       Y4         111
B-9       A5         110
B-10      A6         108
B-11      D7         107
B-12      T1         45
B-13      T0         105
C-1       M2,0       2
C-2       M3,0       62
C-3       D1         118
C-4       D2         57
C-5       Y2         115
C-6       GND        113
C-7       A4         52
C-8       VCC        53
C-9       Y5         109
C-10      Y6         48
C-11      T3         44
C-12      T2         104
C-13      T9         41
D-1       M2,1       4
D-2       M1,1       63
D-3       M0,1       3
D-11      T6         102
D-12      T5         43
D-13      T4         103
E-1       CIN        5
E-2       M0,2       65
E-3       M3,1       64
E-11      GND        97
E-12      GND        98
E-13      GND        99
F-1       M1,2       6
F-2       M2,2       66
F-3       GND        8
F-11      T10        100
F-12      T7         42
F-13      T8         101
G-1       M1,3       9
G-2       M0,3       67
G-3       M3,2       7
G-11      T11        40
G-12      S0         36
G-13      CP         96
H-1       M2,3       69
H-2       M3,3       10
H-3       VCC        68
H-11      I0         34
H-12      S1         95
H-13      S3         94
J-1       GND        11
J-2       EQUAL      71
J-3       A-FULL     70
J-11      VCC        37
J-12      VCC        38
J-13      VCC        39
K-1       RST        13
K-2       OED        72
K-3       ERROR      12
K-11      I3         92
K-12      I2         33
K-13      I1         93
L-1       INTR       14
L-2       INTEN      74
L-3       INTA       73
L-4       D14        18
L-5       D13        79
L-6       GND        23
L-7       A12        22
L-8       VCC        83
L-9       D10        85
L-10      Y10        27
L-11      Y9         88
L-12      I4         32
L-13      S2         35
M-1       SLAVE      75
M-2       HOLD       15
M-3       Y15        77
M-4       A14        78
M-5       A13        80
M-6       D12        81
M-7       Y12        82
M-8       Y11        25
M-9       A10        86
M-10      D9         87
M-11      D8         89
M-12      A8         30
M-13      I5         91
N-1       D15        16
N-2       A15        76
N-3       VCC        17
N-4       Y14        19
N-5       GND        20
N-6       Y13        21
N-7       D11        24
N-8       A11        84
N-9       GND        26
N-10      A9         28
N-11      VCC        29
N-12      Y8         90
N-13      FC         31

PIN DESIGNATIONS
(Sorted by Pin Name)

PIN NAME   PIN NO.   PAD NO.
-          -         37
-          -         39
-          -         97
-          -         99
A-FULL     J-3       70
A0         B-2       60
A1         A-4       58
A2         B-5       116
A3         A-6       114
A4         C-7       52
A5         B-9       110
A6         B-10      108
A7         A-12      106
A8         M-12      30
A9         N-10      28
A10        M-9       86
A11        N-8       84
A12        L-7       22
A13        M-5       80
A14        M-4       78
A15        N-2       76
CIN        E-1       5
CP         G-13      96
D0         A-2       120
D1         C-3       118
D2         C-4       57
D3         B-6       55
D4         B-7       112
D5         A-8       51
D6         A-10      49
D7         B-11      107
D8         M-11      89
D9         M-10      87
D10        L-9       85
D11        N-7       24
D12        M-6       81
D13        L-5       79
D14        L-4       18
D15        N-1       16
GND        E-12      97
GND        E-13      98
GND        E-11      99
GND        F-3       8
GND        L-6       23
GND        C-6       113
VCC        J-13      38
VCC        H-3       68
VCC        C-8       53
VCC        L-8       83
VCC        J-12      37
VCC        J-11      39
EQUAL      J-2       71
ERROR      K-3       12
FC         N-13      31
HOLD       M-2       15
I0         H-11      34
I1         K-13      93
I2         K-12      33
I3         K-11      92
I4         L-12      32
I5         M-13      91
INTA       L-3       73
INTEN      L-2       74
INTR       L-1       14
M0,0       A-1       1
M0,1       D-3       3
M0,2       E-2       65
M0,3       G-2       67
M1,0       B-1       61
M1,1       D-2       63
M1,2       F-1       6
M1,3       G-1       9
M2,0       C-1       2
M2,1       D-1       4
M2,2       F-2       66
M2,3       H-1       69
M3,0       C-2       62
M3,1       E-3       64
M3,2       G-3       7
M3,3       H-2       10
OED        K-2       72
RST        K-1       13
S0         G-12      36
S1         H-12      95
S2         L-13      35
S3         H-13      94
SLAVE      M-1       75
T0         B-13      105
T1         B-12      45
T2         C-12      104
T3         C-11      44
T4         D-13      103
T5         D-12      43
T6         D-11      102
T7         F-12      42
T8         F-13      101
T9         C-13      41
T10        F-11      100
T11        G-11      40
GND        J-1       11
GND        N-5       20
GND        A-9       50
GND        N-9       26
GND        A-5       56
VCC        N-3       17
VCC        N-11      29
VCC        A-3       59
VCC        A-11      47
Y0         B-3       119
Y1         B-4       117
Y2         C-5       115
Y3         A-7       54
Y4         B-8       111
Y5         C-9       109
Y6         C-10      48
Y7         A-13      46
Y8         N-12      90
Y9         L-11      88
Y10        L-10      27
Y11        M-8       25
Y12        M-7       82
Y13        N-6       21
Y14        N-4       19
Y15        M-3       77

LOGIC SYMBOL

(Logic symbol diagram. Signals recoverable from the drawing: multiway inputs M0,0-3 through M3,0-3; data bus D0-D15; control inputs CP, RST, FC, INTR, INTEN, HOLD, OED, SLAVE; address bus Y0-Y15; status signals A-FULL, INTA, EQUAL, ERROR.)



ORDERING INFORMATION
Standard Products

AMD standard products are available in several packages and operating ranges. The order number (Valid Combination)
is formed by a combination of:

a. Device Number
b. Speed Option (if applicable)
c. Package Type
d. Temperature Range
e. Optional Processing

a. DEVICE NUMBER/DESCRIPTION
   Am29C331
   CMOS 16-Bit Microprogram Sequencer

b. SPEED OPTION
   -1 = Speed Select
   -2 = Speed Select (TBD)

c. PACKAGE TYPE
   G = 120-Lead Pin Grid Array without Heatsink (CGX120)

d. TEMPERATURE RANGE
   C = Commercial (0 to +70°C)

e. OPTIONAL PROCESSING
   Blank = Standard processing
   B = Burn-in

Valid Combinations

AM29C331      GC, GCB
AM29C331-1

Valid Combinations

Valid Combinations list configurations planned to be
supported in volume for this device. Consult the local AMD
sales office to confirm availability of specific valid
combinations, to check on newly released valid combinations,
and to obtain additional data on AMD's standard military
grade products.

MILITARY ORDERING INFORMATION
APL Products

AMD products for Aerospace and Defense applications are available in several packages and operating ranges. APL (Approved
Products List) products are fully compliant with MIL-STD-883C requirements. The order number (Valid Combination) for APL
products is formed by a combination of:

a. Device Number
b. Speed Option (if applicable)
c. Device Class
d. Package Type
e. Lead Finish

a. DEVICE NUMBER/DESCRIPTION
   Am29C331
   CMOS 16-Bit Microprogram Sequencer

b. SPEED OPTION
   Not Applicable

c. DEVICE CLASS
   /B = Class B

d. PACKAGE TYPE
   Z = 120-Lead Pin Grid Array without Heatsink (CGX120)

e. LEAD FINISH
   C = Gold

Valid Combinations

AM29C331      /BZC

Valid Combinations

Valid Combinations list configurations planned to be
supported in volume for this device. Consult the local AMD
sales office to confirm availability of specific valid
combinations or to check for newly released valid
combinations.



Group A Tests 

Group A tests consist of Subgroups 
1, 2, 3, 7, 8, 9, 10, 11. 






PIN DESCRIPTION 



A0-A15 Alternate Data (Input)

Input to address multiplexer and counter. 

A-FULL Almost Full (Bidirectional; Three-State)

Indicates that 28 ≤ SP ≤ 63 (meaning there are five or fewer
empty locations left on the stack). Also active during stack
underflow.



CIN Carry In (Input, Active LOW)

Carry-in to the incrementer. 

CP Clock Pulse (Input) 

Clocks sequencer at the LOW-to-HIGH transition. 

D0-D15 Data (Bidirectional, Three-State)

Input to address multiplexer, counter, stack, and comparator 
register. Output for stack and stack pointer. 

EQUAL Equal (Bidirectional, Three-State) 

Indicates that the address comparator is enabled and has 
found a match. 

ERROR Error (Output) 

Indicates a master/slave error in the slave mode. Indicates 
a malfunctioning driver or contention of any output in the 
master mode. 

FC Force Continue (Input) 

Overrides instruction with CONTINUE. 

HOLD Hold (Input) 

Stops the sequencer and three-states the outputs. 



I0-I5 Instruction (Input) 

Selects one of 64 instructions. 

INTA Interrupt Acknowledge (Bidirectional; Three- 
State, Active LOW) 

Indicates that an interrupt is accepted. 

INTEN Interrupt Enable (Input) 

Enables interrupts. 

INTR Interrupt Request (Input) 

Requests the sequencer to interrupt execution. 

M0-3, 0-3 Multiway (Input)

Four sets of multiway inputs providing 16-way branches.
The first index refers to the set number.

OEd Output Enable — D-Bus (Input) 

Enables the D-bus driver, provided that the sequencer is not 
in the hold or slave mode. 



RST Reset (Input; Active LOW) 

Resets the sequencer. 

S0-S3 Select (Input)

Selects one of 16 test conditions. 

SLAVE Slave (Input) 

Makes the sequencer a slave. 

T0-T11 Test (Input) 

Provides external test inputs. 

Y0-Y15 Address (Bidirectional; Three-State) 

Output of microcode address. Input for interrupt address. 



FUNCTIONAL DESCRIPTION 

Architecture 

The major blocks of the sequencer are the address multiplex- 
er, the address register (AR), the stack (with the top of stack 
denoted TOS), the counter (C), the test multiplexer with logic, 
and the address comparison register (R) (Figure 1). The 
bidirectional D-bus provides branch addresses and iteration 
counts; it also allows access to the stack from the outside. 
The A-bus may be used for map addresses. There are four 
sets of four-bit multiway branch inputs (M). The bidirectional Y- 
bus either outputs microprogram addresses or inputs interrupt 
addresses. The buses are all 16 bits wide. Figure 1 shows a 
detailed block diagram of the sequencer. 

Address Multiplexer 

The address multiplexer can select an address from any of 
five sources: 

1) A branch address supplied by the D-bus 

2) A branch address supplied by the A-bus 



3) A multiway-branch address 

4) A return or loop address from the top of stack 

5) The next sequential address from the incrementer 

Multiway-Branch Address 

A multiway-branch address is formed by substituting the lower
four bits of the address on the D-bus (D3, D2, D1, D0) with one
of the four sets (M0X, M1X, M2X, or M3X) of four-bit
multiway-branch addresses. The multiway-branch set is selected by the
number D1D0, while the bits D3 and D2 are "don't cares" (see
Figure 2).



D1   D0   Multiway Set Selected
0    0    M0X
0    1    M1X
1    0    M2X
1    1    M3X

(Figure 2 diagram: the branch address on D15-D4 and the selected multiway inputs MX3-MX0 form the address out, which indexes one of four sixteen-entry lookup tables: Table 1 (M0X), Table 2 (M1X), Table 3 (M2X), or Table 4 (M3X).)

Notes: 1. D1 and D0 select one out of four multiway sets. D3 and D2 are "don't cares."

2. Each set of M3X-M0X can select one of sixteen locations. The multiway-branch address is the
concatenation of D15-D4 (base address) and MX3-MX0.

3. For a given base address, there can be four look-up tables, each sixteen deep.

Figure 2. Multiway Branch
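
The address formation described in the notes can be written out as a short Python sketch. The function name and the list representation of the four multiway sets are assumptions made for this example; the bit manipulation follows the text: D1D0 selects the set, D3 and D2 are ignored, and the selected four bits replace the low nibble of the D-bus address.

```python
def multiway_branch_address(d_bus, multiway_sets):
    """Form a multiway-branch address.

    d_bus         -- 16-bit value on the D-bus
    multiway_sets -- four 4-bit values [M0X, M1X, M2X, M3X]
    """
    set_index = d_bus & 0b11               # D1 D0 select the set
    base = d_bus & 0xFFF0                  # D15-D4 is the base address
    offset = multiway_sets[set_index] & 0xF
    return base | offset
```

With the sets `[0x1, 0x5, 0xA, 0xF]`, a D-bus value of 0x1232 selects set 2 and yields 0x123A; changing D3 or D2 (for example 0x123E) gives the same result.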



Address Register and Incrementer 

The address register contains the current address. It is loaded
from the interrupt multiplexer and feeds the incrementer. The
incrementer is inhibited if CIN is taken HIGH.

Stack 

A 33-word-deep and 16-bit-wide stack provides first-in last-out
storage for return addresses, loop addresses, and counter
values. Items to be pushed come from the incrementer, the
interrupt-return-address register, the counter, or the D-bus.
Items popped go to the address multiplexer, the counter, or
the D-bus.

The access to the stack via the D-bus may be used for context
switching, stack extension, or diagnostics. As the stack is only
accessible from the top, stack extension is done by temporarily
storing the whole or some lower part of the stack outside the
sequencer. The save and the later restore are done with pop
and push operations, respectively, at balanced points in the
microprogram; for example, points with the same stack depth.
The internal D-bus driver must be turned on when popping an
item to the D-bus; if the driver is off, the item will be unstacked
instead. The driver is normally turned on when the Output
Enable signal is asserted and the sequencer is not being reset
(OED = 1, RST = 1).

The stack pointer is a modulo 64 counter, which is incremented
on each push and decremented on each pop. The stack
pointer is reset to zero when the sequencer is reset, but the
pointer may also be reset by instruction. Thus, the stack
pointer indicates the number of items on the stack as long as
stack overflow or underflow has not occurred. Overflow
happens when an item is pushed onto a full stack, whereby
the item at the bottom of the stack is overwritten. Underflow
happens when an item is popped from an empty stack; in this
case the item is undefined.

In the case of stack overflow, the SP is incremented for every
push after overflow. Thus, immediately after the first occurrence
of stack overflow, the SP will be equal to 34. Subsequent
pushes will increment the SP to 35, 36 ... 61, 62, 63, 0,
1, etc. In the case of stack underflow, the SP is decremented
for every pop after underflow. Thus, immediately after the first
occurrence of stack underflow, the SP will be equal to 63.
Subsequent pops will decrement the SP to 62, 61, ... 2, 1, 0,
63, etc.

The contents of the stack pointer are present on the D-bus for
all instructions except POP D, provided the driver is turned on.
The output signal, A-FULL, is active under the following
condition: 28 ≤ SP ≤ 63.
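
The stack pointer behavior can be checked with a small Python model. It tracks only the pointer, not the stack contents, and it reads the A-FULL condition as inclusive at both ends (28 through 63); that reading is an assumption, chosen because it agrees with "five or less empty locations" on a 33-deep stack and with A-FULL being active during underflow. The class name is hypothetical.

```python
class StackPointerModel:
    """Modulo-64 stack pointer of a 33-deep stack (pointer only)."""

    DEPTH = 33
    MODULUS = 64

    def __init__(self):
        self.sp = 0

    def reset(self):
        self.sp = 0

    def push(self):
        # Incremented on every push, including pushes after overflow.
        self.sp = (self.sp + 1) % self.MODULUS

    def pop(self):
        # Decremented on every pop, including pops after underflow.
        self.sp = (self.sp - 1) % self.MODULUS

    @property
    def a_full(self):
        # Assumed inclusive reading of the A-FULL condition.
        return 28 <= self.sp <= 63
```

Pushing 34 items onto an empty stack leaves the pointer at 34, and popping once from an empty stack leaves it at 63, matching the values given in the text.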

Counter 

The counter may be used as a loop counter. It may be loaded 
from the D-bus, the A-bus, or via a pop from the stack. Its 
contents may also be pushed onto the stack. 

A normal for-loop is set up by a FOR instruction, which loads 
the counter from the D- or A-bus with the desired number of 
iterations; the instruction also pushes onto the stack a loop 
address that points to the next sequential instruction. The end 
of the loop is given by an unconditional END FOR instruction, 
which tests the counter value against the value one and then 
decrements the counter. If the values differ, the loop is 
repeated by selecting the address at the stack as the next 
address. If the values are equal, the loop is terminated by 
popping the stack, thereby removing the loop address, and 
selecting the address from the incrementer as the next 
address. The number of iterations is a 16-bit unsigned number,
except that the number zero corresponds to 65,536 iterations. 

By pushing and popping counter values it is possible to handle
nested loops.
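
The FOR / END FOR counting rule can be traced with a few lines of Python. The sketch models only the counter: END FOR compares the counter with one, then decrements it, and the loop repeats while the compared value differed from one. The function name is an assumption made for this example.

```python
def for_loop_iterations(count):
    """Return how many times the loop body runs for a given counter load."""
    counter = count & 0xFFFF                # 16-bit unsigned counter
    iterations = 0
    while True:
        iterations += 1                     # loop body executes once
        was_one = (counter == 1)            # END FOR tests against one...
        counter = (counter - 1) & 0xFFFF    # ...and then decrements
        if was_one:
            break                           # loop address popped, fall through
    return iterations
```

Loading the counter with zero wraps through 0xFFFF before reaching one, which is why zero corresponds to 65,536 iterations.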

Address Comparison 

The sequencer is able to compare the address from the 
interrupt multiplexer with the contents of the comparator 
register. The instruction SET loads the comparator register 
with the address on the D-bus and enables the comparison, 
while CLEAR disables it. The comparison is disabled at reset. 
A HIGH is present at the output EQUAL if the comparison is 
enabled and the two addresses are equal. The comparison is 
useful for detection of a break point or counting the number of
times a microinstruction at a specific address is executed.
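As a rough behavioral sketch of the comparison logic, the class below models the comparator register, the SET and CLEAR instructions, and the EQUAL output. The class and helper names are hypothetical, and the address trace is invented purely for illustration.

```python
class AddressComparator:
    """Behavioral model of the comparator register and the EQUAL output."""

    def __init__(self) -> None:
        self.register = 0
        self.enabled = False  # the comparison is disabled at reset

    def set(self, d_bus: int) -> None:
        """SET: load the comparator register from the D-bus and enable the comparison."""
        self.register = d_bus
        self.enabled = True

    def clear(self) -> None:
        """CLEAR: disable the comparison."""
        self.enabled = False

    def equal(self, address: int) -> bool:
        """EQUAL is HIGH only if the comparison is enabled and the addresses match."""
        return self.enabled and address == self.register


def count_visits(trace, watched_address: int) -> int:
    """Count how often a watched address appears in a (hypothetical) address trace."""
    comparator = AddressComparator()
    comparator.set(watched_address)
    return sum(1 for address in trace if comparator.equal(address))


if __name__ == "__main__":
    print(count_visits([0x10, 0x11, 0x10, 0x12, 0x10], 0x10))
```

In hardware the counting would be done by external logic watching EQUAL; the helper only mirrors that use case in software.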

Instruction Set 

The sequencer has 64 instructions that are divided into four
classes of 16 instructions each. The instruction lines I0-I5
use I5 and I4 to select a class, and I0-I3 to select an
instruction within a class. The classes are:



I5  I4   Classes

0   0    Conditional sequence control,
0   1    Conditional sequence control with inverted polarity,
1   0    Unconditional sequence control, and
1   1    Special function with implicit continue.



Note that for the first three classes I5 forces the condition to
be true and I4 inverts the condition. The basic instructions of
the first three classes are shown in Table 1 and the
instructions of the fourth class in Table 2.
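The class selection and the effective condition can be sketched as follows. The function names are hypothetical; the condition expression is the one given in the footnote of Table 1, Cond. = (Test [S] OR I5) XOR I4, and it is meaningful only for the first three classes.

```python
CLASS_NAMES = {
    (0, 0): "conditional sequence control",
    (0, 1): "conditional sequence control with inverted polarity",
    (1, 0): "unconditional sequence control",
    (1, 1): "special function with implicit continue",
}


def instruction_class(opcode: int) -> str:
    """Select the class from I5 and I4 of a 6-bit opcode."""
    i5 = (opcode >> 5) & 1
    i4 = (opcode >> 4) & 1
    return CLASS_NAMES[(i5, i4)]


def instruction_index(opcode: int) -> int:
    """I3-I0 select the instruction within its class."""
    return opcode & 0xF


def condition(opcode: int, test: bool) -> bool:
    """Cond. = (Test OR I5) XOR I4, for the first three classes only."""
    i5 = (opcode >> 5) & 1
    i4 = (opcode >> 4) & 1
    return bool((int(test) | i5) ^ i4)


if __name__ == "__main__":
    for opcode in (0x0C, 0x1C, 0x2C, 0x3C):
        print(f"{opcode:02X}h", instruction_class(opcode), instruction_index(opcode))
```

With I5 = 1 and I4 = 0 the expression is always true, which is why opcodes 20h-2Fh behave as the unconditional versions of 00h-0Fh.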

Structured microprogramming is supported by sequencer 
instructions that singly or in pairs correspond to high-level 
language control constructs. Examples are FOR I := D DOWN
TO 1 DO . . . END FOR and CASE N OF . . . END CASE. The 
instructions have been given high-level language names 
where appropriate. Figure 2 shows how to microprogram 
important control constructs; the high-level language is on the 
left and the microcode on the right. 



Test Conditions 

The condition for a conditional instruction is supplied by a test
multiplexer, which selects one out of sixteen tests with the
select lines S0-S3. Twelve of these are supplied directly by
the inputs T0-T11, while the remaining four tests are
generated by the test logic from the inputs T8-T11. The following
table shows the assignments.



(S0-S3)H   Test               Intended Use

0-7        T0-T7              General
8          T8                 C (Carry)
9          T9                 N (Negative)
A          T10                V (Overflow)
B          T11                Z (Zero or equal)
C          T8 + T11           C + Z (Unsigned less than or equal, borrow mode)
D          T8 + T11           C + Z (Unsigned less than or equal)
E          T9 ⊕ T10           N ⊕ V (Signed less than)
F          (T9 ⊕ T10) + T11   (N ⊕ V) + Z (Signed less than or equal)



Force Continue 

The sequencer has a force continue (FC) input, which
overrides the instruction inputs I0-I5 with a CONTINUE
instruction. This makes it possible to share the microinstruction field
for the sequencer instruction with some other control or to
initialize a writable control store.

Reset 

In order to start a microprogram properly, the sequencer must 
be reset. The reset works like an instruction overriding both 
the instruction input and the force continue input. The reset
selects the address at the address multiplexer, forces the 
EQUAL output to LOW, and disregards a potential interrupt 
request. It synchronously disables the address comparison 
and initializes the stack pointer to 0. The contents of the stack 
are invalid after a reset. 
TABLE 1. INSTRUCTION SET for I5I4 = 00, 01, 10

                                   Cond.: Fail     Cond.: Pass
I5-I0       Instruction            Y     Stack     Y     Stack            Counter  Comp.  D-Mux

00, 10, 20  Goto D                 INC   -         D     -                -        -      SP
01, 11, 21  Call D                 INC   -         D     Push INC         -        -      SP
02, 12, 22  Exit D                 INC   -         D     Pop              -        -      SP
03, 13, 23  End for D, C ≠ 1       INC   -         D     -                C←C-1    -      SP
            End for D, C = 1       INC   -         INC   -                C←C-1    -      SP
04, 14, 24  Goto A                 INC   -         A     -                -        -      SP
05, 15, 25  Call A                 INC   -         A     Push INC         -        -      SP
06, 16, 26  Exit A                 INC   -         A     Pop              -        -      SP
07, 17, 27  End for A, C ≠ 1       INC   -         A     -                C←C-1    -      SP
            End for A, C = 1       INC   -         INC   -                C←C-1    -      SP
08, 18, 28  Goto M                 INC   -         D:M   -                -        -      SP
09, 19, 29  Call M                 INC   -         D:M   Push INC         -        -      SP
0A, 1A, 2A  Exit M                 INC   -         D:M   Pop              -        -      SP
0B, 1B, 2B  End for M, C ≠ 1       INC   -         D:M   -                C←C-1    -      SP
            End for M, C = 1       INC   -         INC   -                C←C-1    -      SP
0C, 1C, 2C  End Loop               INC   Pop       TOS   -                -        -      SP
0D, 1D, 2D  Call Coroutine         INC   -         TOS   Pop & Push INC   -        -      SP
0E, 1E, 2E  Return                 INC   -         TOS   Pop              -        -      SP
0F, 1F, 2F  End for, C ≠ 1         INC   Pop       TOS   -                C←C-1    -      SP
            End for, C = 1         INC   Pop       INC   Pop              C←C-1    -      SP

Cond. = (Test [S] OR I5) XOR I4
:     = Concatenation
C     = Counter
INC   = Output of Incrementer = AR + 1 (if CIN = LOW)

Note: For unconditional instructions, the action marked under "Cond.: Pass" is taken.





TABLE 2. INSTRUCTION SET for I5I4 = 11

I5-I0  Instruction     Y    Stack     Counter  Comp.         D-Mux

30     Continue        INC  -         -        -             SP
31     For D           INC  Push INC  C←D      -             SP
32     Decrement       INC  -         C←C-1    -             SP
33     Loop            INC  Push INC  -        -             SP
34     Pop D           INC  Pop       -        -             TOS
35     Push D          INC  Push D    -        -             SP
36     Reset SP        INC  SP←0      -        -             SP
37     For A           INC  Push INC  C←A      -             SP
38     Pop C           INC  Pop       C←TOS    -             SP
39     Push C          INC  Push C    -        -             SP
3A     Swap            INC  TOS←C     C←TOS    -             SP
3B     Push C Load D   INC  Push C    C←D      -             SP
3C     Load D          INC  -         C←D      -             SP
3D     Load A          INC  -         C←A      -             SP
3E     Set             INC  -         -        R←D, Enable   SP
3F     Clear           INC  -         -        Disable       SP

R = Comp. Register
Interrupts

The sequencer may be interrupted at the completion of the
current microcycle by asserting the interrupt request input
INTR. The return address of the interrupted routine is saved
on the stack so that nested interrupts can be easily
implemented. An interrupt is accepted if interrupts are enabled and
the sequencer is not being reset or held (INTEN = HIGH,
RST = HIGH, and HOLD = LOW). The interrupt-acknowledge
output (INTA) goes LOW when an interrupt is accepted.

When there is no interrupt, addresses go from the address 
multiplexer to the Y-bus via the driver, and to the address 
register and the comparator via the interrupt multiplexer. When 
there is an interrupt, the driver of the sequencer is turned off,
an external driver is turned on, and the interrupt multiplexer is
switched. The interrupt address is supplied via the external
driver to the Y-bus, the address register, and the comparator
(Figure 4). In order to save the address from the address
multiplexer, the address is stored in the interrupt return
address register, which for simplicity is clocked every cycle. 
The next microinstruction is the first microinstruction of the 
interrupt routine (Figure 5). 

In this cycle the address in the interrupt return address register
is automatically pushed onto the stack. Therefore the microin- 
struction in this cycle must not use the stack; if a stack 
operation is programmed, the result is undefined. The instruc- 
tions that do not use the stack are GOTO D, GOTO A, GOTO 
M, CONTINUE, DECREMENT, LOAD D, LOAD A, SET and 
CLEAR. A RETURN instruction terminates the interrupt routine 
and the interrupted routine is resumed. Interrupts only work 
with a single-level control path. 

Traps 

A trap is an unexpected situation linked to the current
microinstruction that must be handled before the microinstruction
completes and changes the state of the system. An example
of such a situation is an attempt to read a word from memory
across a word boundary in a single cycle. When a trap occurs,
the current microinstruction must be aborted and re-executed
after the execution of a trap routine, which in the meantime will
take corrective measures. An interrupt, on the other hand, is
not linked directly to the current microinstruction, which can
complete safely before an interrupt routine is executed.

Execution of a trap requires that the sequencer ignore the
current microinstruction, select the trap return address at the
address multiplexer, and initiate an interrupt. This will save the
trap return address on the stack and issue the trap address
from an external source (Figure 6). The address register
contains the address of the microinstruction in the pipeline
register; thus the address register already contains the trap
return address when a trap occurs. This address can be
selected by the address multiplexer by disabling the
incrementer (CIN = 1) and using the force continue mode (FC = 1). In
this mode the sequencer ignores the current microinstruction.
The remaining part of the trap handling is done by the interrupt
(Figure 7); thus the section on interrupts also applies to traps.
There is one exception, however. The interrupt enable cannot
be used as a trap enable as it does not control the force
continue mode and the carry-in to the incrementer.
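The address selection during a trap can be sketched as below. The function names are hypothetical, the 16-bit address width is an assumption based on the D15-D0 naming, and the polarity (CIN = 1 disables the incrementer) follows the paragraph above.

```python
ADDRESS_MASK = 0xFFFF  # assumption: 16-bit microprogram addresses (D15-D0)


def incrementer_output(address_reg: int, cin: int) -> int:
    """INC output: AR + 1 normally, AR unchanged when CIN = 1 (incrementer disabled)."""
    step = 0 if cin == 1 else 1
    return (address_reg + step) & ADDRESS_MASK


def trap_return_address(address_reg: int) -> int:
    """Address selected with FC = 1 and CIN = 1: the trapped microinstruction itself."""
    return incrementer_output(address_reg, cin=1)


if __name__ == "__main__":
    ar = 0x0123  # hypothetical address of the microinstruction in the pipeline register
    print(hex(incrementer_output(ar, cin=0)))  # normal continue address
    print(hex(trap_return_address(ar)))        # address saved for re-execution after the trap
```

Forcing a CONTINUE with the incrementer disabled makes the address multiplexer present the trapped instruction's own address, which the interrupt mechanism then saves on the stack.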

Hold Mode 

The sequencer has a hold mode in which the operation is 
suspended. 

The outputs (Y, INTA, A-FULL & EQUAL) are disabled and the
sequencer enters the hold mode immediately after the HOLD
signal goes active. While the sequencer is in this mode, the
internal state is left unchanged and the D-bus is disabled. The
outputs (Y, INTA, A-FULL & EQUAL) are enabled again and
the sequencer leaves the hold mode after the cycle
immediately after the HOLD signal goes inactive.

In a time-multiplexed multi-microprocess system there may be
one sequencer for all processes with microprogrammed
context save and restore, or there may be one sequencer per
microprocess permitting fast process switch. In the latter case
the Y-buses of the sequencers are tied together and
connected to a single microprogram store. A control unit decides on a
cycle-by-cycle basis which sequencer should be running, and
activates the HOLD signal to the remaining sequencers. The
hold mode has higher priority than interrupts, and works 
independently of the reset. The hold mode can only be used 
with a single-level control path. 

Master/Slave Configuration 

In some systems reliability is very important. The master/slave
configuration that consists of two sequencers operated in 
parallel is able to detect faults in both the interconnect and the 
internal function of the sequencers. One sequencer is the 
master and operates normally. The other is the slave, i.e., all 
outputs except the signal ERROR are turned into inputs and 
connected to the outputs of the master. Since the slave is 
operated in parallel with the master, it can compare its result 
with the result of the master and signal an error if they differ. 
The error signal from the master indicates a malfunctioning 
driver or contention. Because a TTL output goes HIGH when 
power is missing, the ERROR signal also indicates power 
failure. 
High-Level Language Constructs

An example of high-level language constructs using Am29C331 instructions is given in Figure 3 (3-1, 3-2, 3-3, and 3-4). 



High-level language               Microcode

REPEAT                            LOOP
  . . .                             . . .
UNTIL CC                          END LOOP NOT CC

WHILE CC DO                       LOOP
                                    IF NOT CC THEN EXIT L
  . . .                             . . .
END WHILE                         END LOOP
                                  L:

LOOP                              LOOP
  . . .                             . . .
  IF CC THEN EXIT                   IF CC THEN EXIT L
  . . .                             . . .
END LOOP                          END LOOP
                                  L:

Figure 3-1. Loops with Unknown Number of Iterations


FOR CNT := 10 DOWN TO 1 DO        FOR D 10
  . . .                             . . .
END FOR                           END FOR

Figure 3-2. Loop with Known Number of Iterations


                                  PUSH D B
CASE I OF                         GOTO M
  . . .                           A:     . . ., RETURN (TO B)
  . . .                           A + 2: . . ., RETURN (TO B)
  . . .                           A + 4: . . ., RETURN (TO B)
  . . .                           A + 6: . . ., RETURN
END CASE                          B:

Figure 3-3. Case Statement
(with D = A15 . . . A4XX00 and M0,0-3 = A3 I1 I0 0 during the
GOTO M instruction. A1A0 must be 00, and X signifies a don't
care.)


                                  PUSH D C
IF X THEN                         IF NOT X THEN GOTO A
  IF Y THEN                       IF NOT Y THEN GOTO B
    . . .                         . . ., RETURN (TO C)
  ELSE                            B:
    . . .                         . . ., RETURN (TO C)
  END IF
ELSE                              A:
  IF Z THEN                       IF NOT Z THEN GOTO D
    . . .                         . . ., RETURN (TO C)
  ELSE                            D:
    . . .                         . . ., RETURN (TO C)
  END IF
END IF                            C:

Figure 3-4. Double-Nested If Statement
Figure 4. Am29C331 Interrupt Cycle 1
(While executing the instruction at A, the sequencer is
interrupted and directed to B.)

Figure 5. Am29C331 Interrupt Cycle 2
(Executing at B.)

Figure 6. Am29C331 Traps Cycle 1
(A trap occurs at the instruction A, and the sequencer is
directed to B. A: instruction trapped by FC = 1, CIN = 1,
INTR = 1. B: Continue.)

Figure 7. Am29C331 Traps Cycle 2
Instruction Set Definition

Legend: • = Other instruction
        ⊙ = Instruction being described
        ○ = Register in part
        CC = (Test [S3-S0])
        P = Test pass
        F = Test fail

Note: Opcode numbers are in hexadecimal notation.

Opcode (I5-I0)  Mnemonic
     Description

20h  BRA_D
     GOTO D
     Unconditional branch to the address specified by the D inputs.
     The D port must be disabled to avoid bus contention.

24h  BRA_A
     GOTO A
     Unconditional branch to the address specified by the A inputs.

28h  BRA_M
     GOTO Multiway (D15-D4 Mx3-Mx0)
     Unconditional branch to the address specified by the M inputs
     concatenated with the D input. The lower four bits on the D bus
     (D3-D0) are replaced by one of the four sets of the four-bit
     multiway branch addresses. The multiway branch set is selected
     by bits D1 and D0 while bits D3 and D2 are "don't cares."

2Ch  BRA_S
     GOTO TOS
     Unconditional branch to the address on the top of the stack.

00h  BRCC_D
     IF CC THEN GOTO D
     ELSE CONTINUE
     If CC is HIGH (pass), branch to the address specified by D. If
     CC is LOW (fail), continue. The D port must be disabled to avoid
     bus contention.

04h  BRCC_A
     IF CC THEN GOTO A
     ELSE CONTINUE
     If CC is HIGH (pass), branch to the address specified by A. If
     CC is LOW (fail), continue.

08h  BRCC_M
     IF CC THEN GOTO Multiway (D15-D4 Mx3-Mx0)
     ELSE CONTINUE
     If CC is HIGH (pass), branch to the address specified by the D
     inputs concatenated with the M inputs. If CC is LOW (fail),
     continue. The lower four bits on the D bus (D3-D0) are replaced
     by one of the four sets of the 4-bit multiway branch addresses.
     The multiway branch set is selected by bits D1 and D0 while bits
     D3 and D2 are "don't cares."

0Ch  BRCC_S
     IF CC THEN GOTO TOS
     ELSE POP STACK
          CONTINUE
     If CC is HIGH (pass), branch to the address on the top of the
     stack. If CC is LOW (fail), pop the stack and continue.
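The multiway address formation described for the GOTO Multiway instructions can be sketched as follows. The function name and the list representation of the four multiway sets are hypothetical, and treating D1 as the more significant select bit is an assumption.

```python
def multiway_address(d_bus: int, multiway_sets) -> int:
    """Form a multiway branch address.

    d_bus:         16-bit value on the D bus (D15-D0)
    multiway_sets: four 4-bit values, one per multiway input set (index 0..3)
    """
    select = d_bus & 0b11                   # D1 and D0 choose the set; D3 and D2 are don't-cares
    low_nibble = multiway_sets[select] & 0xF
    return (d_bus & 0xFFF0) | low_nibble    # D15-D4 concatenated with the selected 4-bit set


if __name__ == "__main__":
    sets = [0xA, 0xB, 0xC, 0xD]             # hypothetical values on the four multiway sets
    for d in (0x1230, 0x1237, 0x123E):
        print(hex(d), "->", hex(multiway_address(d, sets)))
```

A single microinstruction can thus dispatch to one of sixteen targets within the 16-word block addressed by D15-D4, with the target chosen by external signals on the selected set.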



10h  BRNC_D
     IF NOT CC THEN GOTO D
     ELSE CONTINUE
     If CC is LOW (pass), branch to the address specified by D. If
     CC is HIGH (fail), continue. The D port must be disabled to
     avoid bus contention.

14h  BRNC_A
     IF NOT CC THEN GOTO A
     ELSE CONTINUE
     If CC is LOW (pass), branch to the address specified by A. If
     CC is HIGH (fail), continue.

18h  BRNC_M
     IF NOT CC THEN GOTO Multiway (D15-D4 Mx3-Mx0)
     ELSE CONTINUE
     If CC is LOW (pass), branch to the address specified by the D
     inputs concatenated with the M inputs. If CC is HIGH (fail),
     continue. The lower four bits on the D bus (D3-D0) are replaced
     by one of the four sets of the 4-bit multiway branch addresses.
     The multiway branch set is selected by bits D1 and D0 while bits
     D3 and D2 are "don't cares."

1Ch  BRNC_S
     IF NOT CC THEN GOTO TOS
     ELSE POP STACK
          CONTINUE
     If CC is LOW (pass), branch to the address on the top of the
     stack. If CC is HIGH (fail), pop the stack and continue.

21h  CALL_D
     CALL D
     Unconditional branch to the subroutine specified by the D
     inputs. Push the return address (Address Reg. + 1) on the stack.
     The D port must be disabled to avoid bus contention.

25h  CALL_A
     CALL A
     Unconditional branch to the subroutine specified by the A
     inputs. Push the return address (Address Reg. + 1) on the stack.

29h  CALL_M
     CALL Multiway (D15-D4 Mx3-Mx0)
     Unconditional branch to the subroutine specified by the D inputs
     concatenated with the multiway inputs. Push the return address
     (Address Reg. + 1) on the stack. The lower four bits on the D
     bus (D3-D0) are replaced by one of the four sets of the 4-bit
     multiway branch addresses. The multiway branch set is selected
     by bits D1 and D0 while bits D3 and D2 are "don't cares."

2Dh  CALL_S
     CALL TOS
     Unconditional branch to the subroutine specified by the address
     on the top of the stack. The stack is popped and the return
     address (Address Reg. + 1) is then pushed onto the stack.
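CALL TOS (listed as Call Coroutine in Table 1) swaps the current position with the address on top of the stack, which lets two routines resume each other. The sketch below models only that stack exchange; the function name, the Python list used as a stack, and the 16-bit address mask are assumptions for illustration.

```python
ADDRESS_MASK = 0xFFFF  # assumption: 16-bit microprogram addresses


def call_tos(stack: list, address_reg: int) -> int:
    """CALL TOS: branch to the top of stack, replacing it with the return address.

    Returns the next microprogram address; the stack is modified in place.
    """
    target = stack.pop()                               # the stack is popped ...
    stack.append((address_reg + 1) & ADDRESS_MASK)     # ... and Address Reg. + 1 is pushed
    return target


if __name__ == "__main__":
    stack = [0x200]                  # hypothetical resume address of routine B
    nxt = call_tos(stack, 0x100)     # routine A, executing at 100h, yields to B
    print(hex(nxt), [hex(a) for a in stack])
    nxt = call_tos(stack, 0x200)     # routine B, executing at 200h, yields back to A
    print(hex(nxt), [hex(a) for a in stack])
```

The stack depth is unchanged by each call, so the two routines can alternate indefinitely without risk of overflow.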
01h  CCC_D
     IF CC, THEN CALL D
     ELSE CONTINUE
     If CC is HIGH (pass), call the subroutine specified by the D
     inputs. Push the return address (Address Reg. + 1) on the stack.
     If CC is LOW (fail), continue. The D port must be disabled to
     avoid bus contention.

05h  CCC_A
     IF CC, THEN CALL A
     ELSE CONTINUE
     If CC is HIGH (pass), call the subroutine specified by the A
     inputs. Push the return address (Address Reg. + 1) on the stack.
     If CC is LOW (fail), continue.

09h  CCC_M
     IF CC, THEN CALL Multiway (D15-D4 Mx3-Mx0)
     ELSE CONTINUE
     If CC is HIGH (pass), call the subroutine specified by the D
     inputs concatenated with the M inputs. Push the return address
     (Address Reg. + 1) on the stack. The lower four bits on the D
     bus (D3-D0) are replaced by one of the four sets of the 4-bit
     multiway branch addresses. The multiway branch set is selected
     by bits D1 and D0 while bits D3 and D2 are "don't cares."

0Dh  CCC_S
     IF CC, THEN CALL TOS
     ELSE CONTINUE
     If CC is HIGH (pass), call the subroutine specified by the
     address on the top of the stack. The stack is popped and the
     return address (Address Reg. + 1) is pushed onto the stack. If
     CC is LOW (fail), continue.

11h  CNC_D
     IF NOT CC, THEN CALL D
     ELSE CONTINUE
     If CC is LOW (pass), call the subroutine specified by the D
     inputs. Push the return address (Address Reg. + 1) on the stack.
     If CC is HIGH (fail), continue. The D port must be disabled to
     avoid bus contention.

15h  CNC_A
     IF NOT CC, THEN CALL A
     ELSE CONTINUE
     If CC is LOW (pass), call the subroutine specified by the A
     inputs. Push the return address (Address Reg. + 1) on the stack.
     If CC is HIGH (fail), continue.

19h  CNC_M
     IF NOT CC, THEN CALL Multiway (D15-D4 Mx3-Mx0)
     ELSE CONTINUE
     If CC is LOW (pass), call the subroutine specified by the D
     inputs concatenated with the M inputs. Push the return address
     (Address Reg. + 1) on the stack. The lower four bits on the D
     bus (D3-D0) are replaced by one of the four sets of the 4-bit
     multiway branch addresses. The multiway branch set is selected
     by bits D1 and D0 while bits D3 and D2 are "don't cares."

1Dh  CNC_S
     IF NOT CC, THEN CALL TOS
     ELSE CONTINUE
     If CC is LOW (pass), call the subroutine specified by the
     address on the top of the stack. The stack is popped and the
     return address (Address Reg. + 1) is pushed onto the stack. If
     CC is HIGH (fail), continue.
22h  EXIT_D
     EXIT TO D
     Unconditional branch to the address specified by the D inputs
     and pop the stack. The D port must be disabled to avoid bus
     contention.

26h  EXIT_A
     EXIT TO A
     Unconditional branch to the address specified by the A inputs
     and pop the stack.

2Ah  EXIT_M
     EXIT TO Multiway (D15-D4 Mx3-Mx0)
     Unconditional branch to the address specified by the D inputs
     concatenated with the M inputs and pop the stack. The lower four
     bits on the D bus (D3-D0) are replaced by one of the four sets
     of the 4-bit multiway branch addresses. The multiway branch set
     is selected by bits D1 and D0 while bits D3 and D2 are "don't
     cares."

2Eh  EXIT_S
     EXIT TO TOS
     Unconditional branch to the address on the top of the stack and
     pop the stack. Also used for unconditional returns.

02h  XTCC_D
     IF CC, THEN EXIT TO D
     ELSE CONTINUE
     If CC is HIGH (pass), exit to the address specified by the D
     inputs and pop the stack. If CC is LOW (fail), continue with no
     pop. The D port must be disabled to avoid bus contention.

06h  XTCC_A
     IF CC, THEN EXIT TO A
     ELSE CONTINUE
     If CC is HIGH (pass), exit to the address specified by the A
     inputs and pop the stack. If CC is LOW (fail), continue with no
     pop.

0Ah  XTCC_M
     IF CC, THEN EXIT TO Multiway (D15-D4 Mx3-Mx0)
     ELSE CONTINUE
     If CC is HIGH (pass), exit to the address specified by the D
     inputs concatenated with the M inputs and pop the stack. The
     lower four bits on the D bus (D3-D0) are replaced by one of the
     four sets of the 4-bit multiway branch addresses. The multiway
     branch set is selected by bits D1 and D0 while bits D3 and D2
     are "don't cares."

0Eh  XTCC_S
     IF CC, THEN EXIT TO TOS
     ELSE CONTINUE
     If CC is HIGH (pass), exit to the address on the top of the
     stack and pop the stack. If CC is LOW (fail), continue with no
     pop. Also used for conditional returns.
12h  XTNC_D
     IF NOT CC, THEN EXIT TO D
     ELSE CONTINUE
     If CC is LOW (pass), exit to the address specified by the D
     inputs and pop the stack. If CC is HIGH (fail), continue with
     no pop. The D port must be disabled to avoid bus contention.

16h  XTNC_A
     IF NOT CC, THEN EXIT TO A
     ELSE CONTINUE
     If CC is LOW (pass), exit to the address specified by the A
     inputs and pop the stack. If CC is HIGH (fail), continue with
     no pop.

1Ah  XTNC_M
     IF NOT CC, THEN EXIT TO Multiway (D15-D4 Mx3-Mx0)
     ELSE CONTINUE
     If CC is LOW (pass), exit to the address specified by the D
     inputs concatenated with the M inputs and pop the stack. The
     lower four bits on the D bus (D3-D0) are replaced by one of the
     four sets of the 4-bit multiway branch addresses. The multiway
     branch set is selected by bits D1 and D0 while bits D3 and D2
     are "don't cares."

1Eh  XTNC_S
     IF NOT CC, THEN EXIT TO TOS
     ELSE CONTINUE
     If CC is LOW (pass), exit to the address on the top of the
     stack and pop the stack. If CC is HIGH (fail), continue with no
     pop. Also used for conditional returns.

23h  DJMP_D
     IF CNT ≠ 1 THEN CNT := CNT - 1
                     GOTO D
     ELSE CNT := CNT - 1
          CONTINUE
     If the counter is not equal to one, decrement the counter and
     branch to the address specified by the D inputs. If the counter
     is equal to one, then decrement the counter and continue. The D
     port must be disabled to avoid bus contention.

27h  DJMP_A
     IF CNT ≠ 1 THEN CNT := CNT - 1
                     GOTO A
     ELSE CNT := CNT - 1
          CONTINUE
     If the counter is not equal to one, decrement the counter and
     branch to the address specified by the A inputs. If the counter
     is equal to one, then decrement the counter and continue.

2Bh  DJMP_M
     IF CNT ≠ 1 THEN CNT := CNT - 1
                     GOTO Multiway (D15-D4 Mx3-Mx0)
     ELSE CNT := CNT - 1
          CONTINUE
     If the counter is not equal to one, decrement the counter and
     branch to the address specified by the D inputs concatenated
     with the M inputs. The lower four bits on the D bus (D3-D0) are
     replaced by one of the four sets of the 4-bit multiway branch
     addresses. The multiway branch set is selected by bits D1 and
     D0 while bits D3 and D2 are "don't cares."

2Fh  DJMP_S
     IF CNT ≠ 1 THEN CNT := CNT - 1
                     GOTO TOS
     ELSE CNT := CNT - 1
          POP STACK
          CONTINUE
     If the counter is not equal to one, decrement the counter and
     branch to the address on the top of the stack. If the counter
     is equal to one, then decrement the counter, pop the stack and
     continue.
03h  DJCC_D
     IF CC AND CNT ≠ 1 THEN CNT := CNT - 1
                            GOTO D
     ELSE CNT := CNT - 1
          CONTINUE
     If CC is HIGH (pass) and the counter is not equal to one,
     decrement the counter and branch to the address specified by
     the D inputs. If CC is LOW (fail) or the counter is equal to
     one, then decrement the counter and continue. The D port must
     be disabled to avoid bus contention.

07h  DJCC_A
     IF CC AND CNT ≠ 1 THEN CNT := CNT - 1
                            GOTO A
     ELSE CNT := CNT - 1
          CONTINUE
     If CC is HIGH (pass) and the counter is not equal to one,
     decrement the counter and branch to the address specified by
     the A inputs. If CC is LOW (fail) or the counter is equal to
     one, then decrement the counter and continue.

0Bh  DJCC_M
     IF CC AND CNT ≠ 1 THEN CNT := CNT - 1
                            GOTO Multiway (D15-D4 Mx3-Mx0)
     ELSE CNT := CNT - 1
          CONTINUE
     If CC is HIGH (pass) and the counter is not equal to one,
     decrement the counter and branch to the address specified by
     the D inputs concatenated with the M inputs. The lower four
     bits on the D bus (D3-D0) are replaced by one of the four sets
     of the 4-bit multiway branch addresses. The multiway branch set
     is selected by bits D1 and D0 while bits D3 and D2 are "don't
     cares."

0Fh  DJCC_S
     IF CC AND CNT ≠ 1 THEN CNT := CNT - 1
                            GOTO TOS
     ELSE CNT := CNT - 1
          POP STACK
          CONTINUE
     If CC is HIGH (pass) and the counter is not equal to one,
     decrement the counter and branch to the address on the top of
     the stack. If CC is LOW (fail) or the counter is equal to one,
     then decrement the counter, pop the stack and continue.
13h  DJNCC_D
     IF NOT CC AND CNT ≠ 1 THEN CNT := CNT - 1
                                GOTO D
     ELSE CNT := CNT - 1
          CONTINUE
     If CC is LOW (pass) and the counter is not equal to one,
     decrement the counter and branch to the address specified by
     the D inputs. If CC is HIGH (fail) or the counter is equal to
     one, then decrement the counter and continue. The D port must
     be disabled to avoid bus contention.

17h  DJNCC_A
     IF NOT CC AND CNT ≠ 1 THEN CNT := CNT - 1
                                GOTO A
     ELSE CNT := CNT - 1
          CONTINUE
     If CC is LOW (pass) and the counter is not equal to one,
     decrement the counter and branch to the address specified by
     the A inputs. The content of the interrupt return address
     register and the address register is replaced by the A address
     in this case. If CC is HIGH (fail) or the counter is equal to
     one, the current address is incremented, appears on the bus for
     continue, and is stored into the above two registers.

1Bh  DJNCC_M
     IF NOT CC AND CNT ≠ 1 THEN CNT := CNT - 1
                                GOTO Multiway (D15-D4 M3-M0)
     ELSE CONTINUE
     If CC is LOW (pass) and the counter is not equal to one,
     decrement the counter and branch to the address specified by
     the D inputs concatenated with the M inputs. The lower four
     bits on the D bus (D3-D0) are replaced by one of the four sets
     of the 4-bit multiway branch addresses. The multiway branch set
     is selected by bits D1 and D0 while bits D3 and D2 are "don't
     cares."

1Fh  DJNCC_S
     IF NOT CC AND CNT ≠ 1 THEN CNT := CNT - 1
                                GOTO TOS
     ELSE CNT := CNT - 1
          POP STACK
          CONTINUE
     If CC is LOW (pass) and the counter is not equal to one,
     decrement the counter and branch to the address on the top of
     the stack. If CC is HIGH (fail) or the counter is equal to one,
     then decrement the counter, pop the stack and continue.

2Eh
     RETURN
     Unconditional return from subroutine. The return address is
     popped from the stack.

0Eh  RETCC
     IF CC THEN RETURN
     ELSE CONTINUE
     If CC is HIGH (pass), return from subroutine. The return
     address is popped from the stack. If CC is LOW (fail), continue.

1Eh
     IF NOT CC THEN RETURN
     ELSE CONTINUE
     If CC is LOW (pass), return from subroutine. The return address
     is popped from the stack. If CC is HIGH (fail), continue.
31h  FOR_D
     INITIALIZE LOOP
     Push the Address Reg. + 1 on the stack, load the counter from
     the D inputs and continue. Use with DJMP_S for FOR . . . NEXT
     loops. The D port must be disabled to avoid bus contention.

37h  FOR_A
     INITIALIZE LOOP
     Push the Address Reg. + 1 on the stack, load the counter from
     the A inputs and continue. Use with DJMP_S for FOR . . . NEXT
     loops.

33h  LOOP
     INITIALIZE LOOP
     Push the Address Reg. + 1 on the stack and continue. Use with
     BRCC_S for REPEAT . . . UNTIL loops, or with XTCC_D and BRA_S
     for WHILE . . . END WHILE loops.

34h  POP_D
     Pop the stack and output the value on the D outputs and
     continue. The D port must be enabled.

38h  POP_C
     Pop the stack and store the value in the counter and continue.

35h  PUSH_D
     Push the D inputs on the stack and continue. The D port must be
     disabled to avoid bus contention.

39h  PUSH_C
     Push the counter on the stack and continue.

3Ah  SWAP
     Exchange the counter and the top of stack and continue.
3Bh  STACK_C
     Push the counter on the stack and load the counter with the
     value of the D inputs and continue.

3Ch  LOAD_D
     Load the counter with the value of the D inputs and continue.
     The D port must be disabled to avoid bus contention.

3Dh  LOAD_A
     Load the counter with the value of the A inputs and continue.

30h  CONT
     Continue.

32h  DECR
     Decrement the counter and continue.

36h  RESET_SP
     Reset the stack pointer and continue.

3Eh
     Load the comparison register with the value of the D inputs,
     enable the comparator and continue.

3Fh
     Disable the comparator and continue.
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APPLICATIONS 



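[Block diagram: the Am29C331, with Interrupt Vector, Address, and Test inputs, drives the Microprogram Memory from its Y outputs. The memory feeds the Pipeline Register, which supplies the instruction to the Am29C332 ALU. The ALU Status Register feeds the Test inputs. Clock drives the CP inputs.]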

Figure 8. Typical Control-Path Architecture For Am29C300 Family 



[Waveform: ALU Status Register Output / Am29C331 Test Inputs, Am29C331 Outputs, and Microprogram Memory Outputs over one cycle. Marked intervals: Clock to Register Status Outputs of the Am29C332, Test Inputs to Y Outputs, Microprogram Memory Access Time, Register Setup Time.]

Figure 9. Cycle Timing Waveform*

*This waveform shows the timing relationship for the configuration shown in Figure 8.
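
The cycle timing for the Figure 8 configuration can be approximated by adding the four intervals marked in Figure 9. The sketch below assumes the intervals are strictly sequential with no overlap or clock skew. Only the Test-to-Y value (24 ns, Table A, item 6, standard 29C331) comes from this data sheet. The other three numbers are hypothetical placeholders, and the class and field names are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class ControlPathDelays:
    """Delays along the Figure 8 control path, in nanoseconds."""
    clock_to_status_ns: float   # clock to Am29C332 status register outputs
    test_to_y_ns: float         # Am29C331 Test inputs to Y outputs
    memory_access_ns: float     # microprogram memory access time
    register_setup_ns: float    # pipeline register setup time

    def min_cycle_ns(self) -> float:
        """Sum of the four sequential intervals shown in Figure 9."""
        return (self.clock_to_status_ns + self.test_to_y_ns
                + self.memory_access_ns + self.register_setup_ns)


delays = ControlPathDelays(
    clock_to_status_ns=20.0,   # hypothetical
    test_to_y_ns=24.0,         # Table A, T11-0 to Y15-0, 29C331
    memory_access_ns=35.0,     # hypothetical
    register_setup_ns=5.0,     # hypothetical
)
cycle_ns = delays.min_cycle_ns()
```

Substituting the guaranteed values of the actual ALU, memory, and pipeline register gives a first estimate of the microcycle time for a given design.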
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ABSOLUTE MAXIMUM RATINGS 

Storage Temperature -65 to +150°C 

(Case) Temperature Under Bias -55 to +125°C 

Supply Voltage to 

Ground Potential Continuous -0.3 V to +7.0 V 

DC Voltage Applied to Outputs For 

High Output State -0.3 V to +Vcc +0.3 V 

DC Input Voltage -0.3 V to +Vcc +0.3 V 

DC Output Current, Into LOW Outputs 30 mA 

DC Input Current -10 mA to +10 mA 

Stresses above those listed under ABSOLUTE MAXIMUM 
RATINGS may cause permanent device failure. Functionality 
at or above these limits is not implied. Exposure to absolute 
maximum ratings for extended periods may affect device 
reliability. 

DC CHARACTERISTICS over operating range unless otherwise specified (for APL Products, Group A, 
Subgroups 1, 2, 3 are tested unless otherwise noted) 



OPERATING RANGES 

Commercial (C) Devices 

Temperature (Ta) 0 to +70°C 

Supply Voltage (Vcc) +4.75 V to +5.25 V 

Military* (M) Devices 

Temperature (Ta) -55 to +125°C 

Supply Voltage (Vcc) +4.5 V to +5.5 V 

Operating ranges define those limits between which the 
functionality of the device is guaranteed. 

*Military Product 100% tested at Ta = +25°C, +125°C, and
-55°C.



Parameter
Symbol      Parameter Description

VOH         Output HIGH Voltage
VOL         Output LOW Voltage
VIH         Guaranteed Input Logical HIGH Voltage (Note 2)
VIL         Guaranteed Input Logical LOW Voltage (Note 2)
IIL         Input LOW Current
IIH         Input HIGH Current
IOZH        Off-State (HIGH Impedance) Output Current
IOZL        Off-State (HIGH Impedance) Output Current
ICC         Static Power Supply Current (Note 3)
CPD         Power Dissipation Capacitance (Note 4)

Test Conditions (Note 1) 



Vcc ~ f^i"- 
V|N-V|H or V|L 



VCO' 
V|N- 



- Min. 

V|H or V|L 



Iqh = 0.4 mA 



lOL = 8 mA ttuJVSSUS "'iJj 
= 4 jnA % /Mfyglher Nris 



Vcc'Maxi;. '-X 
V|N = 0.5 V*s ■ 



.^cfWiMax. ''-i, ''-ft 

' v«i-ltec-J 9"V 



SS'cg.Bl^aK. 

■9^y^.4 Volts 



Vcc = Max- 
Vo = 0.5 Volts 



Vcc = Max., 

V|N - Vcc or <3ND, 

Io = (lA 



29C331 



29C331-1/-2 



29C331 only 



Vcc - 6.0 V 
Ta = 25°C 
No Load 



Min. 



Max. 



Unit 



mA 



ma 



ka 



UA 



pF Typical 



Notes: 1. VCC conditions shown as Min. or Max. refer to the commercial and military VCC limits.

2. These input levels provide zero-noise immunity and should only be statically tested in a noise-free environment (not functionally tested).

3. Worst-case ICC is measured at the lowest temperature in the specified operating range.

4. CPD determines the no-load dynamic current consumption:

ICC (Total) = ICC (Static) + CPD × VCC × f, where f is the switching frequency of the majority of the internal nodes, normally one-half of the clock
frequency. This specification is not tested.
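
The Note 4 relation can be evaluated directly. The sketch below is a minimal illustration. The function name is invented, and the static current, CPD, and clock frequency in the example are hypothetical placeholders, not data-sheet limits.

```python
def total_icc_ma(icc_static_ma, cpd_pf, vcc_v, clock_hz):
    """ICC(Total) = ICC(Static) + CPD * VCC * f, with f = clock / 2 (Note 4)."""
    f_hz = clock_hz / 2.0                       # majority of internal nodes
    dynamic_a = cpd_pf * 1e-12 * vcc_v * f_hz   # farads * volts * hertz = amperes
    return icc_static_ma + dynamic_a * 1e3


# Hypothetical example: 10 mA static, 500 pF, 5.0 V, 10 MHz clock.
example_icc_ma = total_icc_ma(icc_static_ma=10.0, cpd_pf=500.0,
                              vcc_v=5.0, clock_hz=10e6)
```

With these placeholder inputs the dynamic term is 500 pF × 5 V × 5 MHz = 12.5 mA, so the total is 22.5 mA.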
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SWITCHING CHARACTERISTICS over COMMERCIAL operating range 

A. COMBINATIONAL PROPAGATION DELAYS 



No. 


From 


To 


29C331 


29C331-1 


29C331-2 


Unit 


Max. Delay 


Max. Delay 


Max. Delay 


1 


DiS-O 


V15-O 


22* 


20* 


18 


ns 




Dl5-0 


EQUAL 


32 


28 


23 


ns 




Dl5-0 


ERROR 


36 


32 


26 


ns 


2 


Al5-0 


Y15-O 


20 


18 


16 


ns 




Al5-0 


EQUAL 


31 


27, .- 


?2, , 


ns 




^15-0 


ERROR 


33 


29 >■ 


24, -■ 


ns 


3 


Mx3-X0 


V16-O 


19 


ia- VI 


1S<^., 


ns 




MX3 - XO 


EQUAL 


29 


26.-,, 


21-V. 


ns 




MX3 - XO 


ERROR 


33 


29 S: 


24 ' 


ns 




Yl5-0 


EQUAL 


31 


26.-,;.' 


23. . 


ns 




Y15-0 


ERROR 


26 


^'>,-., 


<. 


ns 


4 


I5-O 


Y31~0 


24 


£2 


18 


ns 


5 


I5-O 


D15-O 


29 


26...,' 


at-,.. 


ns 




I5-O 


EQUAL 


36 


«3 


27 


ns 




I5-O 


ERROR 


40 


36.< 


§»n.^' 


ns 


6 


T11-O 


Y15-O 


24 


2a \ 


feZ 


ns 




Tn-o 
T11-0 
S3-0 


EQUAL 
ERROR 
Y15-O 




32 "• 


26 ■ 
1&.i.f 


ns 
ns 
ns 




Ss-O 


EQUAL 


as . ■^ 


3? 'V- 


a6«' 


ns 




S3-0 


ERROR 


97 -'cw; 


33 r*~ 


27 ji-;. 


ns 


7 


CP 


Y15-O 


^if 


25' 


20 J 


ns 


8 


CP 


D15-O 


2s/:^- 


3or^'- 


ns 


9 


CP 


A-FULL 


27 'H 


24^", 


»*-: 


ns 




CP 


EQUAL 


50l>^ 


32 3 


26 :. 


ns 




CP 


ERROR 


49.1*'- 


3»j»e"' 


ns 


10 


RST 


Y15-O 


26/Si«'> 


24/3 


20/Z 


ns 




HSI 


D15-0 


''Z-j«* 


'zk-, 


^■>, 


ns 


11 


RST 


INIA 


aaws 


J-??-^ 




ns 




RST 


EQUAL 


1> 


ns 




rSt 


ERROR 


a%.^,, 


3^a.'!!. 


ns 


12 


FC 


Y15-O 


^'t.*. 


23:., 


ly- 


ns 


13 


FC 


D15-O 


28"*'- 


25 " 


ns 




FC 


EQUAL 


33 ' 


30 . 


24 


ns 




FC 


ERROR 


3S 1 - 


31-: i 


25 > :, 


ns 




intr 


Y1S-0 


Z 


Z i-v ■ 


z- ' 


ns 


14 


intr 


INTA 


'■''^' 


16 . 


9 


ns 




intr 


EQUAL 


(Nole ■IT' 


(Note 1)i 


(Note 1)'. 


ns 




INTR 


ERROR 


ie. »-■ 


2t , 


18 


ns 




INTEN 


Y15-O 


2 -%. i 


^ ™ 


Z ., , 


ns 


15 


INTEN 


INTA 


ig--- 


'5 " 


9 ,"" 


ns 




INTEN 


EQUAL 


(Note ,1) 


{Nole «> 


(N«»-t>.. 


ns 




INTEN 


ERROR 


'^:-.. 


21 


ia . 


ns 




HOLD 


Y15-O 


z 


Z - 


^..l:' 


ns 




HOLD 


INTA 


z 


z ■' - 


z-' 


ns 




HOLD 


A-FULL 


z 


z 


z 


ns 




HOLD 


EQUAL 


34/Z 


31/Z- 


17/^ '■ 


ns 




HOLD 


ERROR 


46 


18 


17 


ns 




OED 


D15-O 


Z 


■ir'-- 


,2' ' 


ns 




OED 


ERROR 


19 


z ,. 


17 


ns 




INTA 


ERROR 


19" 


1/""- 


17 "' 


ns 




A-FULL 


ERROR 


21" 


20"' 


%7 


ns 




EQUAL 


ERROR 


19" 


Y7*« 


17 


ns 


16 


Cjn 


Y15-O 


24 


21 


18 


ns 




Cin 


EQUAL 


36 


33 


20 


ns 




Cin 


ERROR 


37 


33 


21 


ns 




SLAVE 


Y16-O 


Z 


Z 


Z 


ns 




SLAVE 


D1S-0 


z 


z 


Z 


ns 




SLAVE 


INTA 


z 


z 


z 


ns 




SLAVE 


A-FULL 


z 


z 


z 


ns 




SLAVE 


EQUAL 


z 


z 


z 


ns 



Notes: See notes following Table D.

*This includes using D as select lines for multiway sets.
**In the slave mode.
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SWITCHING CHARACTERISTICS over COMMERCIAL operating range (Cont d.) 



B. OUTPUT DISABLE TIME 



No. 



43 
44 



From 



RST 

RST 

INTR 

INTR 

INTEN 

INTEN 

HOLD 

HOLD 

SLAVE 

SLAVE 

OED 

OED 

RST

RST 

SLAVE 

SLAVE 

CP 

CP 

HOLD 

HOLD 

HOLD 

HOLD 

HOLD 

HOLD 

SLAVE 

SLAVE 

SLAVE 

SLAVE 

SLAVE 

SLAVE 



To 



Description 



Reset-to-Address Enable
Reset-to-Address Disable
INTR-to-Address Enable
INTR-to-Address Disable
INTEN-to-Address Enable
INTEN-to-Address Disable
HOLD-to-Address Enable
HOLD-to-Address Disable
SLAVE-to-Address Enable
SLAVE-to-Address Disable
OED-to-Data Enable
OED-to-Data Disable
Reset-to-Data Enable
Reset-to-Data Disable
SLAVE-to-Data Enable
SLAVE-to-Data Disable
Clock-to-Data Enable
Clock-to-Data Disable
HOLD-to-INTA Enable
HOLD-to-INTA Disable
HOLD-to-A-FULL Enable
HOLD-to-A-FULL Disable
HOLD-to-EQUAL Enable
HOLD-to-EQUAL Disable
SLAVE-to-INTA Enable
SLAVE-to-INTA Disable
SLAVE-to-A-FULL Enable
SLAVE-to-A-FULL Disable
SLAVE-to-EQUAL Enable
SLAVE-to-EQUAL Disable



29C331 



Max. Value 



29 
29 
■ 24 
24 ' 
2A 

■ 24 

- '^- 
.■.^■ 

24: 
" ^- 

-'"»"; 

' 35;." 

22,:: 

■ -'22 ■ 
"' -21 ■ 

--. 2r:- 
... ,21 

■ s'V 

-■-.-22. 

-■...22 

::- ,22. 

^ ■22- 
22 
22 



29C331-1 



Max. Value 



25 
25 . 
21 
21 

-.at.. 

:S1. . 
20 

'2&-' 

; -^-.f ■..■ 

21 

'■■. sz ;• 

.'-.22. ^ 
23-,. 

■ 23 ■", 

"•■■aa.-"- 
.■■22.; 
.-24 

- 24 '-' 

- 19.-' 

- ts; , 

. 18' 

""'is 
.'■.,19 ;■ 

• -19-'-; 
-:ia' ' 

■ 1*. 
19 
19 



29C331-2



Max. Value 



25 

25 ■• 

21 

21 
' 21 ., 
.21 . 

2a' 

2a . 
-2.1.-: 

21 

22 . 
;'22 ;: 

"23 " 
,.23 ■■'. 
■,£2.;- 
,-22".. 

24-- 
■--24'... 

IS, -: 
,"."19.-' 

18' ' 

-■•ta . 
•,.te- ■■■ 

.18-. 
■"19", 

15,.- 

19' - 
19 



Unit 



ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 



Notes: See notes following Table D, 
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SWITCHING CHARACTERISTICS over COMMERCIAL over operating range (Cont'd.) 



C. SETUP AND HOLD TIMES 



No.  Parameter                  For       With Respect To

17   Data Setup                 D15-0     CP ↑
18   Data Hold                  D15-0     CP ↑
19   Alternate Data Setup       A15-0     CP ↑
20   Alternate Data Hold        A15-0     CP ↑
21   Multiway Setup             MX3-X0    CP ↑
22   Multiway Hold              MX3-X0    CP ↑
23   Address Setup              Y15-0     CP ↑
24   Address Hold               Y15-0     CP ↑
25   Instruction Setup          I5-0      CP ↑
26   Instruction Hold           I5-0      CP ↑
27   Forced Continue Setup      FC        CP ↑
28   Forced Continue Hold       FC        CP ↑
29   Test Setup                 T11-0     CP ↑
30   Test Hold                  T11-0     CP ↑
31   Select Setup               S3-0      CP ↑
32   Select Hold                S3-0      CP ↑
33   Reset Setup                RST       CP ↑
34   Reset Hold                 RST       CP ↑
35   Interrupt Request Setup    INTR      CP ↑
36   Interrupt Request Hold     INTR      CP ↑
37   Interrupt Enable Setup     INTEN     CP ↑
38   Interrupt Enable Hold      INTEN     CP ↑
39   Hold Mode Setup            HOLD      CP ↑
40   Hold Mode Hold             HOLD      CP ↑
41   Carry-In Setup             Cin       CP ↑
42   Carry-In Hold              Cin       CP ↑



29C331 



Max. Value 



22„ 



29C331-1 



Max. Value 



'%l 






21 






20 

M 

18 



18^ 






29C331-2 



Max. Value 



1 


21*Mfr'' 



■iMnmm 
21 



20 



1.8 



0^ 



Unit 



ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 
ns 



D. MINIMUM CLOCK REQUIREMENT 



No. 



53 

54 



Description 



Minimum Clock LOW Time 
Minimum Clock HIGH Time 



Mav^mue 



M 



"M 



£#23 
^ 19 



29C331-1 



Max. 



1^ 



^ 



29C331-2



Max. Value



16 



Unit 



ns 
ns 



Notes: 1. (INTR, INTEN)-to-EQUAL is the sum of (INTR, INTEN)-to-Y disable time and Y-to-EQUAL delay
time.
2. CL = 50 pF; CL = 5 pF for Disable Time only.
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SWITCHING CHARACTERISTICS over MILITARY operating range (for APL Products, Group A, Subgroups 
9, 10, 11 are tested unless otherwise noted) 

A. COMBINATIONAL PROPAGATION DELAYS 



                       29C331
No.  From     To       Max. Delay   Unit

1    D15-0    Y15-0    30*          ns
     D15-0    EQUAL    48           ns
     D15-0    ERROR    29**         ns
2    A15-0    Y15-0    27           ns
     A15-0    EQUAL    44           ns
     A15-0    ERROR    50           ns
3    MX3-X0   Y15-0    30           ns
     MX3-X0   EQUAL    48           ns
     MX3-X0   ERROR    55           ns
     Y15-0    EQUAL    41           ns
     Y15-0    ERROR    29**         ns
4    I5-0     Y31-0    32           ns
5    I5-0     D15-0    37           ns
     I5-0     EQUAL    48           ns
     I5-0     ERROR    55           ns
6    T11-0    Y15-0                 ns
     T11-0    EQUAL    48           ns
     T11-0    ERROR                 ns
     S3-0     Y15-0                 ns
     S3-0     EQUAL                 ns
     S3-0     ERROR                 ns
7    CP       Y15-0                 ns
8    CP       D15-0                 ns
9    CP       A-FULL                ns
     CP       EQUAL                 ns
     CP       ERROR                 ns
10   RST      Y15-0                 ns
     RST      D15-0                 ns
11   RST      INTA     22           ns
     RST      EQUAL    48           ns
     RST      ERROR    55           ns
12   FC       Y15-0    32           ns
13   FC       D15-0    37           ns
     FC       EQUAL    48           ns
     FC       ERROR    55           ns
     INTR     Y15-0    Z            ns
14   INTR     INTA     21           ns
     INTR     EQUAL    (Note 1)     ns
     INTR     ERROR    49           ns
     INTEN    Y15-0    Z            ns
15   INTEN    INTA     21           ns
     INTEN    EQUAL    (Note 1)     ns
     INTEN    ERROR    49           ns
     HOLD     Y15-0    Z            ns
     HOLD     INTA     Z            ns
     HOLD     A-FULL   21/Z         ns
     HOLD     EQUAL    43/Z         ns
     HOLD     ERROR    49           ns
     OED      D15-0    26           ns
     OED      ERROR    Z            ns
     INTA     ERROR    29**         ns
     A-FULL   ERROR    29**         ns
     EQUAL    ERROR    29**         ns
16   Cin      Y15-0    32           ns
     Cin      EQUAL    48           ns
     Cin      ERROR    55           ns
     SLAVE    Y15-0    Z            ns
     SLAVE    D15-0    Z            ns
     SLAVE    INTA     Z            ns
     SLAVE    A-FULL   Z            ns
     SLAVE    EQUAL    Z            ns

Notes: See notes following Table D.

*This includes using D as select lines for multiway sets.
**In the slave mode.
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SWITCHING CHARACTERISTICS over MILITARY operating range (Cont'd.) 






B. OUTPUT DISABLE TIME 






                                                      29C331
No.  From     To       Description                    Max. Value   Unit

     RST      Y15-0    Reset-to-Address Enable        26           ns
     RST      Y15-0    Reset-to-Address Disable       26           ns
43   INTR     Y15-0    INTR-to-Address Enable         26           ns
44   INTR     Y15-0    INTR-to-Address Disable        26           ns
     INTEN    Y15-0    INTEN-to-Address Enable        26           ns
     INTEN    Y15-0    INTEN-to-Address Disable       26           ns
     HOLD     Y15-0    HOLD-to-Address Enable         26           ns
     HOLD     Y15-0    HOLD-to-Address Disable        26           ns
     SLAVE    Y15-0    SLAVE-to-Address Enable        26           ns
     SLAVE    Y15-0    SLAVE-to-Address Disable       26           ns
     OED      D15-0    OED-to-Data Enable             26           ns
     OED      D15-0    OED-to-Data Disable            26           ns
     RST      D15-0    Reset-to-Data Enable           26           ns
     RST      D15-0    Reset-to-Data Disable          26           ns
     SLAVE    D15-0    SLAVE-to-Data Enable           26           ns
     SLAVE    D15-0    SLAVE-to-Data Disable          26           ns
     CP       D15-0    Clock-to-Data Enable           23           ns
     CP       D15-0    Clock-to-Data Disable          23           ns
     HOLD     INTA     HOLD-to-INTA Enable            21           ns
     HOLD     INTA     HOLD-to-INTA Disable           21           ns
     HOLD     A-FULL   HOLD-to-A-FULL Enable          21           ns
     HOLD     A-FULL   HOLD-to-A-FULL Disable         21           ns
     HOLD     EQUAL    HOLD-to-EQUAL Enable           21           ns
     HOLD     EQUAL    HOLD-to-EQUAL Disable          21           ns
     SLAVE    INTA     SLAVE-to-INTA Enable           21           ns
     SLAVE    INTA     SLAVE-to-INTA Disable          21           ns
     SLAVE    A-FULL   SLAVE-to-A-FULL Enable         21           ns
     SLAVE    A-FULL   SLAVE-to-A-FULL Disable        21           ns
     SLAVE    EQUAL    SLAVE-to-EQUAL Enable          21           ns
     SLAVE    EQUAL    SLAVE-to-EQUAL Disable         21           ns


Notes: See notes following Table D. 
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SWITCHING CHARACTERISTICS over MILITARY operating range (Cont'd.) 



C. SETUP AND HOLD TIMES 



                                                             29C331
No.  Parameter                  For       With Respect To    Max. Value   Unit

17   Data Setup                 D15-0     CP ↑               32           ns
18   Data Hold                  D15-0     CP ↑               1            ns
19   Alternate Data Setup       A15-0     CP ↑               32           ns
20   Alternate Data Hold        A15-0     CP ↑               1            ns
21   Multiway Setup             MX3-X0    CP ↑               32           ns
22   Multiway Hold              MX3-X0    CP ↑               1            ns
23   Address Setup              Y15-0     CP ↑               27           ns
24   Address Hold               Y15-0     CP ↑               2            ns
25   Instruction Setup          I5-0      CP ↑               32           ns
26   Instruction Hold           I5-0      CP ↑                            ns
27   Forced Continue Setup      FC        CP ↑               32           ns
28   Forced Continue Hold       FC        CP ↑               1            ns
29   Test Setup                 T11-0     CP ↑               32           ns
30   Test Hold                  T11-0     CP ↑                            ns
31   Select Setup               S3-0      CP ↑               32           ns
32   Select Hold                S3-0      CP ↑                            ns
33   Reset Setup                RST       CP ↑               32           ns
34   Reset Hold                 RST       CP ↑               1            ns
35   Interrupt Request Setup    INTR      CP ↑               27           ns
36   Interrupt Request Hold     INTR      CP ↑               1            ns
37   Interrupt Enable Setup     INTEN     CP ↑               27           ns
38   Interrupt Enable Hold      INTEN     CP ↑               1            ns
39   Hold Mode Setup            HOLD      CP ↑               27           ns
40   Hold Mode Hold             HOLD      CP ↑               1            ns
41   Carry-In Setup             Cin       CP ↑               30           ns
42   Carry-In Hold              Cin       CP ↑               1            ns

D. MINIMUM CLOCK REQUIREMENTS

                                  29C331
No.  Description                  Max. Value   Unit

53   Minimum Clock LOW Time       33           ns
54   Minimum Clock HIGH Time      28           ns



Notes: 1. (INTR, INTEN)-to-EQUAL is the sum of (INTR, INTEN)-to-Y disable time and Y-to-EQUAL delay
time.

2. CL = 50 pF; CL = 5 pF for Disable Time only.

3. The status of I5-I0 and FC must not be changed during the clock LOW time. 
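
Two derived figures follow from the military tables by simple addition, and the sketch below works them out. Note 1 defines the (INTR, INTEN)-to-EQUAL delay as a sum of two tabulated values. The clock period bound assumes that the LOW and HIGH pulse widths in Table D are the only constraints on the clock, which is an assumption made for this example. The constant and function names are invented.

```python
# Military 29C331 values taken from Tables A, B, and D above (ns).
INTR_TO_Y_DISABLE_NS = 26   # Table B: INTR-to-Address Disable
Y_TO_EQUAL_NS = 41          # Table A: Y15-0 to EQUAL
CLOCK_LOW_NS = 33           # Table D, item 53
CLOCK_HIGH_NS = 28          # Table D, item 54


def intr_to_equal_ns(y_disable_ns, y_to_equal_ns):
    """Note 1: sum of the Y disable time and the Y-to-EQUAL delay."""
    return y_disable_ns + y_to_equal_ns


def clock_period_bound_ns(low_ns, high_ns):
    """Lower bound on the clock period implied by the pulse widths alone."""
    return low_ns + high_ns


intr_equal_ns = intr_to_equal_ns(INTR_TO_Y_DISABLE_NS, Y_TO_EQUAL_NS)
period_ns = clock_period_bound_ns(CLOCK_LOW_NS, CLOCK_HIGH_NS)
```

With the tabulated values the INTR-to-EQUAL delay comes to 67 ns and the pulse-width bound on the clock period to 61 ns. The actual system cycle is normally set by the longer control-path delays, not by this bound.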
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SWITCHING TEST CIRCUIT 



VOUT 




Rl=240 



TC003420 



A. Three-State Outputs 



Notes: 1. CL = 50 pF includes scope probe, wiring, and stray capacitances without device in test fixture.

2. S1, S2, S3 are closed during function tests and all AC tests except output enable tests.

3. S1 and S3 are closed while S2 is open for tPZH test.
S1 and S2 are closed while S3 is open for tPZL test.

4. CL = 5.0 pF for output disable tests.
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SWITCHING TEST WAVEFORMS 



[Waveform: DATA input and TIMING INPUT, with setup time (ts) and hold time (th) marked.]

Notes: 1. Diagram shown for HIGH data only. Output
transition may be opposite sense.
2. Cross hatched area is don't care condition.

Setup, Hold, and Release Times

[Waveform: LOW-HIGH-LOW pulse and HIGH-LOW-HIGH pulse.]

Pulse Width

[Waveform: input transition, opposite phase input transition, and output. Input levels 3 V, 1.5 V, 0 V; output levels VOH, 1.5 V, VOL.]

Propagation Delay

[Waveform: control input (3 V, 1.5 V, 0 V) with output normally LOW and output normally HIGH, each shown with one load-circuit switch open and a 0.5 V output offset.]

Notes: 1. Diagram shown for Input Control Enable-LOW
and Input Control Disable-HIGH.
2. S1, S2, and S3 of Load Circuit are closed
except where shown.

Enable and Disable Times
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Test Philosophy and Methods 

The following points give the general philosophy that we apply
to tests that must be properly engineered if they are to be
implemented in an automatic environment. The specifics of
which philosophies are applied to which tests are shown.

1. Ensure the part is adequately decoupled at the test head. 
Large changes in supply current when the device switches 
may cause function failures due to Vcc changes. 

2. Do not leave inputs floating during any tests, as they may 
oscillate at high frequency. 

3. Do not attempt to perform threshold tests at high speed. 
Following an input transition, ground current may change by 
as much as 400 mA in 5 - 8 ns. Inductance in the ground 
cable may allow the ground pin at the device to rise by 
hundreds of millivolts momentarily. Current level may vary 
from product to product. 

4. Use extreme care in defining input levels for AC tests. Many
inputs may be changed at once, so there will be significant
noise at the device pins, which may not actually reach VIL or
VIH until the noise has settled. AMD recommends using
VIL ≤ 0 V and VIH ≥ 3 V for AC tests.

5. To simplify failure analysis, programs should be designed to 
perform DC, Function, and AC tests as three distinct groups 
of tests. 

6. Capacitive Loading for AC Testing 

Automatic testers and their associated hardware have stray
capacitance which varies from one type of tester to
another, but is generally around 50 pF. This makes it
impossible to make direct measurements of parameters
which call for a smaller capacitive load than the associated
stray capacitance. Typical examples of this are the
so-called "float delays," which measure the propagation
delays into and out of the high-impedance state, and are
usually specified at a load capacitance of 5.0 pF. In these
cases, the test is performed at the higher load capacitance
(typically 50 pF), and engineering correlations based on
data taken with a bench setup are used to predict the
result at the lower capacitance.

Similarly, a product may be specified at more than one
capacitive load. Since the typical automatic tester is not
capable of switching loads in mid-test, it is impossible to
make measurements at both capacitances even though
they may both be greater than the stray capacitance. In
these cases, a measurement is made at one of the two
capacitances. The result at the other capacitance is
predicted from engineering correlations based on data
taken with a bench setup and the knowledge that certain
DC measurements (IOH, IOL, for example) have already
been taken and are within specification. In some cases,
special DC tests are performed in order to facilitate this
correlation.

7. Threshold Testing 

The noise associated with automatic testing, the long 
inductive cables, and the high gain of bipolar devices when 
in the vicinity of the actual device threshold frequently give 
rise to oscillations when testing high-speed circuits. These 
oscillations are not indicative of a reject device, but instead, 
of an overtaxed test system. To minimize this problem, 
thresholds are tested at least once for each input pin. 
Thereafter, "hard" high and low levels are used for other 
tests. Generally this means that function and AC testing are
performed at "hard" input levels rather than at VIL max.
and VIH min.

8. AC Testing 

Occasionally parameters are specified that cannot be
measured directly on automatic testers because of tester
limitations. Data input hold times often fall into this
category. In these cases, the parameter in question is guaranteed
by correlating these tests with other AC tests that have
been performed. These correlations are arrived at by the
cognizant engineer using data from precise bench
measurements in conjunction with the knowledge that certain DC
parameters have already been measured and
are within specification.

In some cases, certain AC tests are redundant since they 
can be shown to be predicted by other tests that have 
already been performed. In these cases, the redundant 
tests are not performed. 

9. Output Short-Circuit Current Testing 

When performing IOS tests on devices containing RAM or
registers, great care must be taken that undershoot caused
by grounding the high-state output does not trigger parasitic
elements which in turn cause the device to change state.
In order to avoid this effect, it is common to make the
measurement at a voltage (VOUTPUT) that is slightly above
ground. The VCC is raised by the same amount so that the
result (as confirmed by Ohm's law and precise bench
testing) is identical to the VOUT = 0, VCC = Max. case.
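
Item 3 above quotes a ground-current change of up to 400 mA in 5 - 8 ns and a resulting rise of hundreds of millivolts at the ground pin. A rough check with V = L × di/dt is sketched below. The 5 nH ground-path inductance is a hypothetical value chosen for illustration, not a measured figure, and the function name is invented.

```python
def ground_bounce_v(inductance_h, delta_i_a, delta_t_s):
    """Voltage across the ground-path inductance: V = L * di/dt."""
    return inductance_h * delta_i_a / delta_t_s


GROUND_INDUCTANCE_H = 5e-9   # hypothetical ground cable inductance
DELTA_I_A = 0.4              # 400 mA change quoted in item 3

bounce_fast_v = ground_bounce_v(GROUND_INDUCTANCE_H, DELTA_I_A, 5e-9)
bounce_slow_v = ground_bounce_v(GROUND_INDUCTANCE_H, DELTA_I_A, 8e-9)
```

Under this assumption the ground pin moves by roughly 0.25 V to 0.4 V, consistent with the "hundreds of millivolts" stated above, and large enough to upset a threshold measurement made at speed.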



SWITCHING WAVEFORMS 
KEY TO SWITCHING WAVEFORMS 



WAVEFORM    INPUTS                              OUTPUTS

                                                Will be changing from H to L
            May change from L to H              Will be changing from L to H
            Don't care; any change permitted    Changing; state unknown
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SWITCHING WAVEFORMS (Cont'd.) 




[Waveform over Cycle 1 and Cycle 2: CLOCK, HOLD, RESET, INTEN, INTR, INTA, Y, INT-VECT BUFFER (VECT OFF, VECT ON, VECT OFF), ADDRESS REGISTER, and INTERRUPT RETURN ADDRESS REGISTER, with input-to-output and clock-to-output delays marked.]

Interrupt Timing 

Notes: 1. Interrupt Request comes from an interrupt-controller register. It reflects the CP ↑ to INTR time of
the interrupt controller.

2. During Cycle 2, there may be contention on the Y-bus if the Y-bus is turned ON before the INT- 
VECT buffer is turned OFF. 

3. Refer to Figures 4 and 5 for definition of A and B. 
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SWITCHING WAVEFORMS (Cont'd.) 



[Waveform: CP, RST, and INTA during reset.]

Reset Timing 



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SWITCHING WAVEFORMS (Cont'd.) 



[Waveform: clock, RST, and HOLD inputs with the Y, INTA, A-FULL, and EQUAL outputs.]



Am29331 Hold Timing 



INPUT/OUTPUT CIRCUIT DIAGRAM 
(All Devices) 



[Circuit diagram: driving output and driven input.]
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Am29C332 

CMOS 32-Bit Arithmetic Logic Unit 



ADVANCE INFORMATION 



DISTINCTIVE CHARACTERISTICS 



Single Chip, 32-Bit ALU
Standard product supports 110-ns microcycle time
for the 32-bit data path. It is a combinatorial ALU
with equal cycle time for all instructions.
Speed Select supports 80-ns system cycle time.

Flow-through Architecture
A combinatorial ALU with two input data ports and
one output data port allows implementation of either
parallel or pipelined architectures.

64-Bit In, 32-Bit Out Funnel Shifter
This unique functional block allows n-bit shift-up,
shift-down, 32-bit barrel shift or 32-bit field extract.



Supports All Data Types 

It supports one-, two-, three- and four-byte data for 
all operations and variable-length fields for logical 
operations. 

Multiply and Divide Support
Built-in hardware to support two-bit-at-a-time modified
Booth's algorithm and one-bit-at-a-time division
algorithm.

Extensive Error Checking
Parity check and generate provides data transmission
check and master/slave mode provides complete
function checking.
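
The funnel shifter listed among the distinctive characteristics can be pictured as a 32-bit window selected from a 64-bit input. The sketch below is a behavioral illustration only. The function names and the way the two 32-bit operands are concatenated are assumptions made for the example, not the Am29C332's documented operand mapping.

```python
MASK32 = 0xFFFFFFFF


def funnel_shift(high, low, position):
    """Return the 32-bit window of the 64-bit value high:low that starts
    `position` bits above bit 0 (0 <= position <= 32)."""
    if not 0 <= position <= 32:
        raise ValueError("position must be in 0..32")
    combined = ((high & MASK32) << 32) | (low & MASK32)
    return (combined >> position) & MASK32


def shift_down(x, n):
    """n-bit shift-down: zeros enter from the top."""
    return funnel_shift(0, x, n)


def shift_up(x, n):
    """n-bit shift-up: zeros enter from the bottom."""
    return funnel_shift(x, 0, 32 - n)


def rotate_down(x, n):
    """32-bit barrel shift: the same word feeds both halves."""
    return funnel_shift(x, x, n % 32)
```

Feeding two different words into the halves gives a field extract across a word boundary. For example, `funnel_shift(0x000000AB, 0xCD000000, 24)` returns `0xABCD`.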



GENERAL DESCRIPTION 



The Am29C332 is a 32-bit wide non-cascadable Arithmetic
Logic Unit (ALU) with integration of functions that normally
don't cascade, such as barrel shifters, priority encoders
and mask generators. Two input data ports and one output
data port provide flow-through architecture and allow the
designer to implement his/her architecture with any degree
of pipelining and no built-in penalties for branching. Also,
the simplicity of a three-bus ALU allows easy implementation
of parallel or reconfigurable architectures. The register
file is off-chip to allow unlimited expansion and regular
addressability.

The Am29C332 supports one-, two-, three- and four-byte
data for arithmetic and logic operations. It also supports
multiprecision arithmetic and shift operations. For logical
operations, it can support variable-length fields up to 32
bits. When fewer than four bytes are selected, unselected
bits are passed to the destination without modification. The
device also supports two-bit-at-a-time modified Booth's
algorithm for high-speed multiplication and one-bit-at-a-time
division. Both signed and unsigned integers for all
byte-aligned data types mentioned above are supported.
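
The pass-through of unselected bits for operand widths below four bytes can be sketched as a mask-and-merge step. The sketch assumes that the selected bytes are the low-order ones and that the unselected bits come from a separate pass-through operand. Both points are assumptions for this example, as this section does not state which input supplies them, and the function and parameter names are invented.

```python
MASK32 = 0xFFFFFFFF


def merge_selected_bytes(result, passthrough, width_bytes):
    """Keep the low `width_bytes` bytes of `result`; copy the remaining
    bits unchanged from `passthrough`."""
    if width_bytes not in (1, 2, 3, 4):
        raise ValueError("width_bytes must be 1, 2, 3, or 4")
    selected = (1 << (8 * width_bytes)) - 1
    return (result & selected) | (passthrough & ~selected & MASK32)


# Two-byte operation: the upper two bytes pass through unmodified.
merged = merge_selected_bytes(0x11223344, 0xAABBCCDD, 2)   # 0xAABB3344
```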

The Am29C332 is designed to support 110-ns microcycle 
time standard speed, and 80-ns microcycle time with speed 
select. The device is packaged in a 169-lead pin-grid-array 
package. 
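
The two-bit-at-a-time modified Booth multiplication mentioned above can be illustrated in software. The sketch below is the textbook radix-4 Booth recoding and makes no claim about how the Am29C332 sequences its multiply steps. The function name and the 32-bit default width are choices made for the example.

```python
def booth_radix4_multiply(multiplicand, multiplier, bits=32):
    """Signed multiply that retires two multiplier bits per step using
    modified (radix-4) Booth recoding. `bits` must be even."""
    mask = (1 << bits) - 1

    def to_signed(value):
        value &= mask
        return value - (1 << bits) if value >> (bits - 1) else value

    m = to_signed(multiplicand)
    r = multiplier & mask        # two's-complement bit pattern
    product = 0
    previous_bit = 0             # implicit bit to the right of bit 0
    for i in range(0, bits, 2):
        b0 = (r >> i) & 1
        b1 = (r >> (i + 1)) & 1
        digit = b0 + previous_bit - 2 * b1   # one of -2, -1, 0, +1, +2
        product += (digit * m) << i          # weight 4**(i // 2)
        previous_bit = b1
    return product
```

Each loop iteration corresponds to one multiply step, so a 32-bit multiplier needs 16 steps instead of the 32 required by a one-bit-at-a-time method.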



SIMPLIFIED BLOCK DIAGRAM 
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RELATED AMD PRODUCTS 



Part No.     Description

Am29C01      CMOS 4-Bit Microprocessor Slice
Am29C10A     CMOS 12-Bit Sequencer
Am29C101     CMOS 16-Bit Microprocessor
Am29112      8-Bit Cascadable Microprogram Sequencer
Am29114      Real-Time Interrupt Controller
Am29C116     CMOS 16-Bit Microcontroller
Am29C323     CMOS 32 x 32 Parallel Multiplier
Am29325      32-Bit Floating Point Processor
Am29C325     CMOS 32-Bit Floating Point Processor
Am29331      16-Bit Microprogram Sequencer
Am29C331     CMOS 16-Bit Microprogram Sequencer
Am29334      64x18 Four-Port, Dual-Access Register File
Am29C334     CMOS 64x18 Four-Port, Dual-Access Register File
Am29337      16-Bit Bounds Checker
Am29338      32-Bit Byte Queue
Am29C516     CMOS 16x16 Multiplier
Am29C517     CMOS 16x16 Multiplier with Separate I/O



CONNECTION DIAGRAM 
169-Lead PGA 
Bottom View 
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VCC 


DA30 


0A2S 


0B31 


0A31 
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PIN DESIGNATIONS 

(Sorted by Pin No.) 


PIN NO 


PIN NAME 


PAD 
NO. 


PIN 
NO. 


PIN NAME 


PAD 
NO. 


PIN 
NO. 


PIN NAME 


PAD 
NO. 


PIN 
NO. 


PIN NAME 


PAD 
NO. 


A-1 


DBs 


1 


C-9 


W3 


145 


J-15 


GND 


105 


R-10 


Y31 


66 


A-2 


DA5 


164 


C-10 


lo 


139 


J-16 


Y5 


101 


R-11 


GND 


64 


A-3 


DB4 


161 


C-11 


GND 


143 


J-17 


Y4 


102 


R-12 


Vcc 


71 


A-4 


DB2 


157 


C-12 


I5 


134 


K-1 


DB16 


27 


R-1 3 


Y25 


74 


A-5 


DBi 


155 


C-13 


CP 


130 


K-2 


PA, 


25 


R-14 


GND 


79 


A-6 


DBo 


153 


C-14 


SLAVE 


127 


K-3 


DA15 


24 


R-15 


Yi9 


82 


A-7 


Pl 


148 


C-15 


N 


120 


K-15 


Y7 


99 


R-16 


Yi5 


88 


A-8 


P2 


149 


C-16 


L 


118 


K-16 


Ye 


100 


R-17 


Yi4 


89 


A-9 


W2 


142 


C-17 


GND 


117 


K-17 


GND 


98 


T-1 


DA23 


42 


A-10 


l2 


137 


D-1 


DBb 


7 


L-1 


PBi 


26 


T-2 


DB23 


41 


A-11 


I3 


136 


D-2 


PBo 


6 


L-2 


DA16 


28 


T-3 


DA24 


46 


A-12 


l6 


133 


D-3 


PAo 


5 


L-3 


Vcc 


22 


T-4 


DA25 


48 


A-13 


Is 


131 


D-15 


C 


119 


L-15 


Vcc 


103 


T-5 


DA27 


52 


A-14 


MLINK 


129 


D-16 


Vcc 


116 


L-16 


Vcc 


103 


T-6 


DA2e 


54 


A-1S 


M/m 


125 


D-17 


PYo 


115 


L-17 


Vcc 


103 


T-7 


DA30 


58 


A-16 
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124 


E-1 


DBg 


9 


M-1 


DB18 


31 


T-8 


DA31 


60 


A-17 


HOLD 


123 


E-2 


DA9 


10 


M-2 


DA, 7 


30 


T-9 


PA3 


61 


B-1 


DAe 


2 


E-3 


DAb 


8 


M-3 


DB17 


29 


T-10 


Y30 


67 
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DB5 
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E-15 
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Ys 


96 


T-11 
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70 
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DAa 
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E-16 


PYi 


114 
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Y11 


93 


T-12 


GND 


72 
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DAa 
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E-17 


Yo 
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Y9 


95 


T-13 
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76 


B-S 


DAi 
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11 
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33 
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78 


B4 
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1 
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13 
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34 
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Pa 
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32 


T-16 


Y18 


83 
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Po 
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F-15 
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113 


N-15 


Y12 


92 


T-17 
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86 


B-9 


Wi 
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F-16 


GND 


110 


N-16 
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94 


U-1 


PA2 


43 


B-10 


Wo 


140 


F-17 


PERR 
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N-17 


Vcc 


97 


U-2 


PB2 


44 


B-11 


I1 


138 
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P-1 


DB20 


35 


U-3 


DB24 


45 


B-12 


I4 


135 
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16 
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DA20 
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49 


B-13 


I7 
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RS 


128 


G-15 


GND 


104 


P-15 


Oe-Y 


87 


U-6 


DB29 


55 


B-1S 


MCin 


126 


G-16 


GND 


104 


P-16 


Y13 


90 


U-7 


DA29 


56 


B-16 


V 


121 


G-17 


GND 


104 


P-17 


GND 


91 


U-8 


DB30 


57 


B-17 


Z 


122 


H-1 


OB12 


15 


R-1 


DB22 


39 


U-9 


PB3 


62 


C-1 
C-2 


DB7 
DA7 


3 
4 


H-2 

H-3 


DA13 


18 


R-2 


DA21 


38 


U-10 


Y28 


69 


C-3 


DA4 


162 


H-15 


Y3 


106 


R-4 


DA22 
DBjs 


40 
47 


U-11 
U-12 


Y29 
Y26 


68 
73 


C-4 


DB3 


159 


H-16 


Y2 


107 


R-5 


DA26 


50 


U-13 


Y24 


75 


C-5 


DAo 


154 


H-17 


Yi 


108 


R-6 


DB2e 


53 


U-14 


Y22 


77 


C-6 


P4 


151 


J-1 


DA14 


20 


R-7 


Vcc 


63 


U-15 


Y20 


81 


C-7 


Vcc 


144 


J-2 


DB14 


19 


R-8 


DB31 


59 


U-16 


Y17 


84 


C-8 


W4 146 1 


J-3 


DB15 


23 


R-9 


MSERR 


-iH- 


U-17 


GND 


85



PIN DESIGNATIONS
(Sorted by Pin Names)

PIN NAME  PIN NO.  PAD NO.   PIN NAME  PIN NO.  PAD NO.   PIN NAME  PIN NO.  PAD NO.   PIN NAME  PIN NO.  PAD NO.
BOROW     A-16     124       DB7       C-1      3         I2        A-10     137       Vcc       T-14     78
C         D-15     119       DB8       D-1      7         I3        A-11     136       Vcc       N-17     97
CP        C-13     130       DB9       E-1      9         I4        B-12     135       Vcc       D-16     116
DA0       C-5      154       DB10      F-1      11        I5        C-12     134       Vcc       R-12     71
DA1       B-5      156       DB11      F-2      13        I6        A-12     133       W0        B-10     140
DA2       B-4      158       DB12      H-1      15        I7        B-13     132       W1        B-9      141
DA3       B-3      160       DB13      H-3      17        I8        A-13     131       W2        A-9      142
DA4       C-3      162       DB14      J-2      19        L         C-16     118       W3        C-9      145
DA5       A-2      164       DB15      J-3      23        MCin      B-15     126       W4        C-8      146
DA6       B-1      2         DB16      K-1      27        MLINK     A-14     129       Y0        E-17     109
DA7       C-2      4         DB17      M-3      29        M/m       A-15     125       Y1        H-17     108
DA8       E-3      8         DB18      M-1      31        MSERR     R-9      65        Y2        H-16     107
DA9       E-2      10        DB19      N-1      33        N         C-15     120       Y3        H-15     106
DA10      F-3      12        DB20      P-1      35        OE-Y      P-15     87        Y4        J-17     102
DA11      G-1      14        DB21      P-3      37        P0        B-8      147       Y5        J-16     101
DA12      G-2      16        DB22      R-1      39        P1        A-7      148       Y6        K-16     100
DA13      H-2      18        DB23      T-2      41        P2        A-8      149       Y7        K-15     99
DA14      J-1      20        DB24      U-3      45        P3        B-7      150       Y8        M-15     96
DA15      K-3      24        DB25      R-4      47        P4        C-6      151       Y9        M-17     95
DA16      L-2      28        DB26      U-4      49        P5        B-6      152       Y10       N-16     94
DA17      M-2      30        DB27      U-5      51        PA0       D-3      5         Y11       M-16     93
DA18      N-3      32        DB28      R-6      53        PA1       K-2      25        Y12       N-15     92
DA19      N-2      34        DB29      U-6      55        PA2       U-1      43        Y13       P-16     90
DA20      P-2      36        DB30      U-8      57        PA3       T-9      61        Y14       R-17     89
DA21      R-2      38        DB31      R-8      59        PB0       D-2      6         Y15       R-16     88
DA22      R-3      40        GND       G-3      21        PB1       L-1      26        Y16       T-17     86
DA23      T-1      42        GND       R-11     64        PB2       U-2      44        Y17       U-16     84
DA24      T-3      46        GND       G-17     104       PB3       U-9      62        Y18       T-16     83
DA25      T-4      48        GND       G-15     104       PERR      F-17     111       Y19       R-15     82
DA26      R-5      50        GND       G-16     104       PY0       D-17     115       Y20       U-15     81
DA27      T-5      52        GND       C-11     143       PY1       E-16     114       Y21       T-15     80
DA28      T-6      54        GND       T-12     72        PY2       F-15     113       Y22       U-14     77
DA29      U-7      56        GND       R-14     79        PY3       E-15     112       Y23       T-13     76
DA30      T-7      58        GND       U-17     85        RS        B-14     128       Y24       U-13     75
DA31      T-8      60        GND       P-17     91        SLAVE     C-14     127       Y25       R-13     74
DB0       A-6      153       GND       K-17     98        V         B-16     121       Y26       U-12     73
DB1       A-5      155       GND       J-15     105       Vcc       R-7      63        Y27       T-11     70
DB2       A-4      157       GND       F-16     110       Vcc       L-16     103       Y28       U-10     69
DB3       C-4      159       GND       C-17     117       Vcc       L-15     103       Y29       U-11     68
DB4       A-3      161       HOLD      A-17     123       Vcc       L-17     103       Y30       T-10     67
DB5       B-2      163       I0        C-10     139       Vcc       C-7      144       Y31       R-10     66
DB6       A-1      1         I1        B-11     138       Vcc       L-3      22        Z         B-17     122



LOGIC SYMBOL



(Logic symbol diagram, drawing LS002911. The signal labels
legible in the drawing include SLAVE, PERR, BOROW, MCin, MLINK,
M/m, CP, HOLD and MSERR.)



ORDERING INFORMATION
Standard Products

AMD standard products are available in several packages and operating ranges. The order number (Valid
Combination) is formed by a combination of:
   a. Device Number
   b. Speed Option (if applicable)
   c. Package Type
   d. Temperature Range
   e. Optional Processing

a. DEVICE NUMBER/DESCRIPTION
   Am29C332
   CMOS 32-Bit Arithmetic Logic Unit

b. SPEED OPTION
   -1 = Speed Select
   -2 = Speed Select (TBD)

c. PACKAGE TYPE
   G = 169-Lead Pin Grid Array without Heatsink
       (CGX169)

d. TEMPERATURE RANGE
   C = Commercial (0 to +85°C)

e. OPTIONAL PROCESSING
   Blank = Standard processing
   B = Burn-in



Valid Combinations

   AM29C332
   AM29C332-1       GC, GCB



Valid Combinations

Valid Combinations list configurations planned to be
supported in volume for this device. Consult the local AMD
sales office to confirm availability of specific valid
combinations, to check on newly released combinations, and
to obtain additional data on AMD's standard military grade
products.



ORDERING INFORMATION (Cont'd.)
APL Products

AMD products for Aerospace and Defense applications are available in several packages and operating ranges. APL (Approved
Products List) products are fully compliant with MIL-STD-883C requirements. The order number (Valid Combination) for APL
products is formed by a combination of:
   a. Device Number
   b. Speed Option (if applicable)
   c. Device Class
   d. Package Type
   e. Lead Finish

a. DEVICE NUMBER/DESCRIPTION
   Am29C332
   CMOS 32-Bit Arithmetic Logic Unit

b. SPEED OPTION
   Not Applicable

c. DEVICE CLASS
   /B = Class B

d. PACKAGE TYPE
   Z = 169-Lead Pin Grid Array without Heatsink
       (CGX169)

e. LEAD FINISH
   C = Gold



Valid Combinations

   AM29C332         /BZC



Valid Combinations

Valid Combinations list configurations planned to be
supported in volume for this device. Consult the local AMD
sales office to confirm availability of specific valid
combinations or to check for newly released valid
combinations.



Group A Tests

Group A tests include Subgroups
1, 2, 3, 7, 8, 9, 10, 11.



PIN DESCRIPTION



BOROW Borrow (Input)

When HIGH, the Carry In and Carry Out are borrows for
subtract operations.

C, Z, N, V, L Status (Input/Output)

When the Register Status pin is LOW, these pins give the
Carry, Zero, Negative, Overflow and Link outputs of the ALU
where applicable to the instruction being executed. When
not applicable to the instruction being executed, or when the
Register Status pin is HIGH, these pins give the outputs of
the Carry, Zero, Negative, Overflow and Link bits of the
Internal Status Register. In Slave mode, C, Z, N, V and L
become inputs.

CP Clock Input (Input)

Clocks internal registers (status, Q) at the LOW to HIGH
transition, provided the HOLD input is LOW.

DA0-DA31 Data Input for DA-bus (Input) 

Data input lines for operand A. 

DB0-DB31 Data Input for DB-bus (Input) 

Data input lines for operand B. 

HOLD Hold (Input, Active HIGH) 

When HIGH, it inhibits the update of the status and Q 
registers. 

I0-I6 Instruction Inputs (Input)

Used to select the operation to be performed.

I7-I8 Byte Width Inputs (Input)

Byte width inputs for byte boundary aligned operand
instructions. Selects the sources for width and position
inputs for variable field bit operands. If I7 is LOW, it selects
the width input from pins W4-W0. If I7 is HIGH, the width
input is selected from the internal width register. Similarly, if
I8 is LOW it selects the position inputs from pins P5-P0, and
if HIGH it selects input from the internal position register.

MCin Macro Status Carry (Input)

External carry input.

MLINK Macro Status Link (Input)

External link input.

M/m Macro/Micro Select (Input)

When HIGH, selects macro carry and macro link pins as
input instead of micro carry and micro link from the
micro-status register.



MSERR Master-Slave Error (Output) 

When HIGH, this signal indicates that the master's and 
slave's data were not identical. 

OE-Y Output Enable (Input, Active LOW)

When OE-Y is HIGH the Y-bus is disabled (three-stated).

P0-P5 Position Inputs (Input)

Position input to select the position of the least significant bit
of a field. Also indicates the amount by which data is to be
shifted up (P5 = LOW) or down (P5 = HIGH) or rotated.

PA0-PA3 Parity Input for DA-bus (Input)

Parity input for operand A on DA-bus (one per byte).
Even parity is used for the Am29C332.

PB0-PB3 Parity Input for DB-bus (Input)

Parity input for operand B on DB-bus (one per byte).

PERR Parity Error (Input/Output)

When HIGH, indicates that a parity error was detected on
the DA or DB inputs.

PY0-PY3 Parity for Y-bus (Input/Output)

Parity output for data on Y-bus (one per byte). Even parity is
used for the Am29C332. In Slave mode, PY0-PY3 become
inputs.

RS Register Status Mode Pin (Input) 

Selects between ALU status (Register Status = LOW) or 
register status (Register Status = HIGH) on the C, Z, N, V 
and L outputs. 

SLAVE Slave (Input)

When HIGH, this pin puts the ALU in the slave mode. All
output pins become input pins and signals on them are
compared with the ALU's internally generated results. When
OE-Y is HIGH, the Y0-Y31 and PY0-PY3 inputs are
ignored. When the SLAVE pin is LOW, the ALU is put in
master mode where outputs are generated as normal.

W0-W4 Width Inputs (Input)

Width input to select the width of a contiguous bit field.

Y0-Y31 Data Out/In Lines (Input/Output)

When OE-Y is LOW and the ALU is in the Master mode, the
ALU result is enabled on the Y-bus. When OE-Y is HIGH,
the Y-bus is three-stated. In Slave mode the Y-bus acts as
external data input.

Figure 1. Detailed Block Diagram

Figure 2. Am29C332 Family High-performance System Block Diagram



PRODUCT OVERVIEW 

The Am29C332 is a 32-bit wide, high-performance, non- 
expandable Arithmetic Logic Unit (ALU). It has two 32-bit wide 
input ports (A and B) and one 32-bit wide output port (Y). 
These three ports provide flexibility and accessibility for high- 
performance processor designs. Dedicated input and output 
ports provide a flow-through architecture and avoid the 
penalty associated with switching the bus half-way through the 
cycle for input and output of data. The chip is designed for use 
with a dual-access RAM (Am29C334) as a register file. In 
addition, the three-bus architecture facilitates the connection 
of other arithmetic units in parallel with the Am29C332 for 
high-performance systems. 

The Am29C332 supports one-, two-, three-, and four-byte
arithmetic operations. It also supports multiprecision arithmetic
and multiple bit shifts. For logical operations, it can handle
variable-length fields of up to 32 bits. The chip incorporates
dedicated hardware to allow efficient implementation of a
two-bit-at-a-time (modified Booth) multiply algorithm, supporting
signed and unsigned arithmetic data types. Similarly, hardware
is provided to support a bit-at-a-time divide algorithm, also
supporting signed and unsigned arithmetic data types. An
internal 32-bit register (Q) is used by the multiply and divide
hardware for double precision operands. For business applications,
the Am29C332 supports variable-length BCD arithmetic.

Field logical instructions operate on bit-fields taken from the A 
and B data inputs; they may be of variable width and starting 
position. A is normally the source input and B the destination 
input. In general, destination bits not falling within a specified 
field are passed by the ALU unchanged. Field width and 
position are specified either by direct inputs to the chip, or by 
entries in the status register. There are two kinds of field 
logical instructions - aligned and non-aligned. The first type of 
instruction assumes that source and destination fields are 
aligned and the operation is performed only for bits within the
specified fields. In the second type of instruction, source and 
destination fields are normally non-aligned. However, it is 
always assumed that one field (either source or destination) is 
least-significant-bit (LSB) aligned. 

If the destination field is LSB aligned then the source field is
downshifted in order to make it LSB aligned as well.
Downshifting is accomplished by making the 6-bit position input
equal to the two's complement of the number of places the
field is to be downshifted. If the source field is LSB aligned
then it is upshifted in order to align it with the destination.
Upshifting is accomplished by making the position inputs equal
to the number of places the field is to be upshifted. Any other
type of field operation is not allowed. Whenever the field
crosses the word boundary, the portion not falling within the
word boundary is ignored. This effect is useful when performing
operations on fields that overlap two different words.
Instructions to perform straightforward multiple-bit shifts
(either up or down) are also provided. Additionally, it is possible
to extract a bit-field from a word in one instruction, even if that
field overlaps a word boundary.

The power and the flexibility of the processor comes partly
from its ability to generate a mask to control the width of an
operation for each instruction without any overhead. For all
byte aligned instructions (three quarters of the instruction set),
the mask is either 1, 2, 3 or 4 bytes wide and is generated from
the byte width input (I8-I7). For all field instructions the mask
is of variable width and is generated from the position inputs
(P0-P5) and the width inputs (W0-W4). Table 1 describes
the position displacement from the position inputs and Table 2
the bit field from the width inputs.

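TABLE 1. POSITION INPUTS AND BIT
DISPLACEMENT

         Inputs           Bit Displacement
P5  P4  P3  P2  P1  P0           p
0   0   0   0   0   0            0
0   0   0   0   0   1            1
0   0   0   0   1   0            2
.   .   .   .   .   .            .
0   1   1   1   1   1           31
1   0   0   0   0   0          -32
1   0   0   0   0   1          -31
.   .   .   .   .   .            .
1   1   1   1   1   1           -1

TABLE 2. WIDTH INPUTS AND BIT FIELD

       Inputs          Bit Field
W4  W3  W2  W1  W0         w
0   0   0   0   0         32
0   0   0   0   1          1
0   0   0   1   0          2
.   .   .   .   .          .
1   1   1   1   1         31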
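
As a small illustration of Table 2 (a sketch only; the function
names are hypothetical and not AMD's), the 5-bit width input maps
to a field width of 1 to 32 bits, with the all-zeros code standing
for 32:

```python
def decode_width(w_bits: int) -> int:
    """Map the 5-bit W4..W0 input to a field width in bits (Table 2)."""
    w_bits &= 0x1F
    return 32 if w_bits == 0 else w_bits


def encode_width(width: int) -> int:
    """Map a field width of 1..32 bits to the 5-bit W4..W0 input."""
    if not 1 <= width <= 32:
        raise ValueError("width must be in the range 1..32")
    return width & 0x1F  # 32 wraps to the all-zeros code
```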



Whenever the width of the operand is less than 32 bits, all
unselected bits from the inputs of the ALU are passed to the
output without any modification. Depending upon the instruction
type, unselected bits are taken from different sources. For
example, in all single operand instructions, bits from the source
operand (from either A or B input) are passed in unselected bit
positions. For two operand instructions, bits from the B input
are passed in unselected bit positions. There are some
exceptions which are explained in the instruction set section.

The processor has a 32-bit status register to indicate the
status of different operations performed. The status register is
loaded at the rising edge of the clock with new status unless
the HOLD signal is HIGH. The bit position for each status bit is
given in the functional description. The least significant byte of
the status register holds the six position bits (PR0-PR5). The
two most significant bits of this byte may be read or loaded but
are otherwise unused by the ALU. The second byte (bits 8 to
15) consists of the five width bits (WR0-WR4) and three
read-only bits that are a combinational function of other status bits,
and which indicate useful branch conditions. The third byte
consists of ALU status bits plus bits for high-speed multiply
and divide. The most significant byte holds intermediate nibble
carries for BCD operations. An extract-status instruction is
provided which allows a Boolean value to be formed from any
selected bit. This is particularly useful in machines employing a
stack architecture. Instructions to save and restore the status
register are provided. As the entire status of each instruction is
stored in the status register, interrupts at any microinstruction
boundary are feasible.

The processor has a 32-bit wide priority encoder to support 
floating-point and graphics operations. The priority encoder 
supports all byte aligned data types - the result is dependent 
upon the byte width specified. The result of a priority encode is 
also loaded into the position bits of the status register. The 
result of the prioritize operation can then be used in the 
following clock cycle, e.g., to normalize a floating-point num- 
ber or to help detect the edge of a polygon in graphics 
applications. 

To support system diagnostics, the Am29C332 has a special 
"Master-Slave" mode. To use this mode, two chips are 
connected in parallel, and hence receive the same instructions 
and data. The master chip is used for the normal data path. 
However, in the slave chip, all outputs become inputs. The
slave compares the outputs of the master with its own 
internally generated result. If the two do not match, the slave 
will activate an error signal. 

As a further diagnostic aid, byte-wise parity checking is
performed at both the A and B data inputs. The "parity" signal
is activated if an error is detected. Parity bits (one per byte) are
generated for the 32-bit output bus.

FUNCTIONAL DESCRIPTION 

A detailed description of each functional block is given in the 
following paragraphs. 



64-Bit Funnel Shifter

The 64-bit funnel shifter is a combinatorial network. The 64-bit
input is formed from a combination of the A and B inputs. This
may be left-shifted by up to 31 bits before being used by the
ALU. The output of the shifter is the most significant 32 bits of
the result. The 64-bit shifter can be used on either the A or B
operands to perform barrel shifts (either up or down) or
rotates. The operation is controlled by positioning operands
properly at the input of the 64-bit up-shifter.

The number "n" by which the operand is shifted comes from
two sources: the microprogram memory via the P0-P5 pins or
the internal register (byte 0 of the status register), PR0-PR5,
as selected by an instruction bit.

In general, the 6-bit position input, P0-P5, takes a 6-bit two's
complement number representing upshifts from 0 to 31 places
(positive numbers) or downshifts from 1 to 32 places (negative
numbers).
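
The funnel-shift principle can be sketched in software as follows.
This is an illustrative model under stated assumptions, not a
description of the actual circuit: the function names are
hypothetical, and the way operands are placed on the two halves of
the 64-bit input for each operation is an assumption chosen to
reproduce rotates, upshifts and downshifts.

```python
MASK32 = 0xFFFFFFFF


def funnel_shift(high: int, low: int, n: int) -> int:
    """Left-shift the 64-bit value high:low by n places (0..31) and
    return the most significant 32 bits of the shifted value."""
    if not 0 <= n <= 31:
        raise ValueError("n must be in the range 0..31")
    combined = ((high & MASK32) << 32) | (low & MASK32)
    return ((combined << n) >> 32) & MASK32


def rotate_up(x: int, n: int) -> int:
    """Rotate: the same operand is placed on both halves."""
    return funnel_shift(x, x, n)


def shift_up(x: int, n: int) -> int:
    """Upshift: operand on the upper half, zeros on the lower half."""
    return funnel_shift(x, 0, n)


def shift_down(x: int, k: int) -> int:
    """Logical downshift by k places (1..32): operand on the lower
    half, zeros on the upper half, funnel shifted up by 32 - k."""
    if not 1 <= k <= 32:
        raise ValueError("k must be in the range 1..32")
    return funnel_shift(0, x, 32 - k)
```

In this model a downshift of k places is obtained with an up-shift
of 32 - k places, which is why a single up-shifter suffices for
both directions.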

Mask Generator 

The mask generator logic provides the ability to generate the
appropriate mask for an operand of given width and position.
The generation of the mask depends upon two types of
instructions. The first type has byte boundary aligned operands
(widths of either 1, 2, 3 or 4 bytes) with the least
significant bit aligned to bit 0. The width of an operand is
specified by the byte width inputs (I8 and I7) as shown in Table
3. The second type of instruction has operands of variable
width (1 to 32 bits) and position. The operand is specified by
the width inputs (W0-W4) and the position inputs (P0-P5)
indicating the least significant bit position of the operand.
Thus, in this type of instruction the operand may or may not be
least significant bit aligned. Depending upon the type of
instruction, the mask generator first generates a fence of all
zeros starting from the least significant bit with the width
specified either by the byte width or the width input fields. This
fence can be upshifted by up to 31 bits by the 32-bit mask
shifter. Whenever the mask is moved up over the 32-bit
boundary, it does not wrap around. Instead, ONEs are
inserted from the least significant end. This configuration
provides the ability to operate on a contiguous field located
anywhere in a word, or across a word boundary.

The mask generator can be used as a pattern generator by
allowing the mask to pass through the ALU (by using the
PASS-MASK instruction). For example, a single-bit wide mask can be
generated, and shifting it up by different amounts gives
walking ONE or walking ZERO patterns for memory tests.
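
A minimal software sketch of the mask described above is given
here for illustration. The function names are hypothetical. The
sketch assumes that a mask bit of 0 marks a selected position and
that fence bits shifted past bit 31 are simply dropped.

```python
MASK32 = 0xFFFFFFFF


def generate_mask(width: int, position: int = 0) -> int:
    """Return a 32-bit mask holding a fence of `width` zeros whose
    least significant bit sits at `position`; all other bits are ONE."""
    if not 1 <= width <= 32:
        raise ValueError("width must be in the range 1..32")
    if not 0 <= position <= 31:
        raise ValueError("position must be in the range 0..31")
    field = ((1 << width) - 1) << position  # ones where the field lies
    return ~field & MASK32                  # no wrap-around past bit 31


def byte_mask(width_bytes: int) -> int:
    """Mask for a byte boundary aligned operand of 1 to 4 bytes."""
    return generate_mask(8 * width_bytes)


def walking_zero() -> list:
    """The 32 single-bit masks, usable as a walking ZERO test pattern."""
    return [generate_mask(1, p) for p in range(32)]


if __name__ == "__main__":
    print(f"{byte_mask(2):08X}")          # 16-bit operand
    print(f"{generate_mask(4, 30):08X}")  # field crossing the word boundary
```

Complementing each entry of walking_zero() gives the corresponding
walking ONE pattern; how the device itself produces that polarity
is not covered by this sketch.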

TABLE 3.

I8   I7   Width in Bytes
0    0          4
0    1          1
1    0          2
1    1          3



Arithmetic and Logical Unit 

The ALU is a three input unit which uses the mask as a second
or third operand in every instruction. The mask is used to
merge two operands. For all selected bits (wherever the mask
is 0), the desired operation specified by the instruction input is
performed, and for all unselected bits either corresponding
destination bits or zeros are passed through. The status of
each operation (carry, negative, zero, overflow, link) applies to
the result only over the specified width. For all byte aligned
arithmetic and logical operations (first three quarters of the
instruction set), the status is extracted from the appropriate
byte boundary. For all field operations (last quarter of the
instruction set), the operand width is assumed to be 32 bits for
status generation. The ZERO flag always indicates the status
of all bits selected by the mask.
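
The merge under mask can be written out as a short sketch. It is
an illustration only: masked_op and its parameters are
hypothetical names, the B input is assumed to be the destination,
and status generation is reduced to the ZERO flag over the
selected bits.

```python
MASK32 = 0xFFFFFFFF


def masked_op(op, a: int, b: int, mask: int, pass_zeros: bool = False):
    """Apply op(a, b) where mask bits are 0; elsewhere pass either the
    destination (b) bits or zeros. Returns (result, zero_flag)."""
    raw = op(a, b) & MASK32
    background = 0 if pass_zeros else (b & MASK32)
    result = (raw & ~mask & MASK32) | (background & mask)
    zero_flag = (raw & ~mask & MASK32) == 0  # ZERO covers selected bits only
    return result, zero_flag


if __name__ == "__main__":
    # Byte-wide OR: only bits 7..0 are selected by the mask.
    result, z = masked_op(lambda x, y: x | y, 0x0000000F, 0x12345600, 0xFFFFFF00)
    print(f"{result:08X}", z)
```

In a byte-wide add of 0xFF and 0xAABBCC01, the carry out of bit 7
does not disturb the unselected upper bytes, because those bits are
taken from the destination and not from the adder.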

The actual width of the ALU is 34 bits. There are two extra bits
used for the high speed signed and unsigned multiplication
instructions. These two bits are automatically concatenated to
the most-significant end of the ALU depending upon the width
specified for the operation. Since the modified Booth algorithm
requires a two-bit down-shift each cycle, these ALU bits
generate the two most-significant bits of the partial product.

The ALU is capable of shifting data down by two bits for the 
multiplication algorithm, up by one bit for the divide algorithm 
and single-bit-up-shifts. 
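
For readers unfamiliar with the two-bit-at-a-time algorithm, the
following sketch shows textbook radix-4 (modified Booth) recoding
for signed operands. It is not a model of the Am29C332 data path
or of its multiply-step instructions; the function name is
hypothetical and Python integers stand in for the ALU and
Q-register.

```python
def booth_radix4_multiply(multiplicand: int, multiplier: int, bits: int = 32) -> int:
    """Signed multiply that retires two multiplier bits per step."""
    if bits % 2:
        raise ValueError("bits must be even")
    mask = (1 << bits) - 1
    m = multiplier & mask            # two's complement bit pattern
    a = multiplicand & mask
    if a >> (bits - 1):              # sign-extend the multiplicand
        a -= 1 << bits
    product = 0
    prev = 0                         # implied bit to the right of bit 0
    for step in range(bits // 2):
        b0 = (m >> (2 * step)) & 1
        b1 = (m >> (2 * step + 1)) & 1
        digit = -2 * b1 + b0 + prev  # recoded digit in -2..+2
        product += (digit * a) << (2 * step)
        prev = b1                    # carried into the next step
    return product


if __name__ == "__main__":
    print(booth_radix4_multiply(7, -3))
```

The variable prev plays the part of the multiplier bit that must be
remembered between steps; the status register description states
that the M flag holds a multiplier bit for this purpose.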

The processor is capable of performing BCD arithmetic on
packed BCD numbers. The ALU has separate carry logic for
BCD operations. This logic generates nibble carries (BCD digit
carry) from propagate and generate signals formed from the A
and B operands. In order to simplify the hardware while
maintaining throughput, the BCD add and subtract operations
are performed in two cycles. In the first cycle, ordinary binary
addition or subtraction is performed and BCD nibble carries
are generated. These are blocked from affecting the result at
this stage, but are saved in the status register to be used later
for BCD correction (NC0-NC7). In the second cycle all BCD
numbers are adjusted by examining the previously generated
nibble carries. Since all the necessary information is stored in
the status register, the processor can be interrupted after the
first BCD cycle.
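
The two-cycle scheme can be illustrated for addition with the
sketch below. It is one arithmetic model that is consistent with
the description, not a statement of how the correction hardware is
built. The function names are hypothetical, and the adjustment
rule (add 6 to every digit that produced a decimal carry) is an
assumption.

```python
MASK32 = 0xFFFFFFFF


def bcd_add_cycle1(a: int, b: int):
    """Cycle 1: ordinary binary add, plus the decimal digit carries
    NC0..NC7 (bit i set when digit i produces a carry)."""
    binary_sum = (a + b) & MASK32
    nibble_carries = 0
    carry = 0
    for digit in range(8):
        da = (a >> (4 * digit)) & 0xF
        db = (b >> (4 * digit)) & 0xF
        carry = 1 if da + db + carry > 9 else 0
        nibble_carries |= carry << digit
    return binary_sum, nibble_carries


def bcd_add_cycle2(binary_sum: int, nibble_carries: int) -> int:
    """Cycle 2: adjust the binary sum using the saved nibble carries."""
    correction = 0
    for digit in range(8):
        if (nibble_carries >> digit) & 1:
            correction |= 6 << (4 * digit)
    return (binary_sum + correction) & MASK32


if __name__ == "__main__":
    s, nc = bcd_add_cycle1(0x19, 0x28)
    print(f"{bcd_add_cycle2(s, nc):X}")  # 19 + 28 as packed BCD
```

Because the second cycle needs only the binary sum and the saved
carries, the operation can be suspended between the two cycles, as
the text notes.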

Priority Encoder 

The priority encoder is provided to support floating-point
arithmetic and some graphics primitives. The priority encoder
takes up to 32 bits as input and generates a 5-bit wide binary
code to indicate the location of the most significant one in the
operand. Input to the priority encoder comes from the input
multiplexer, which masks all bits that the user does not want to
participate in the prioritization. The priority encoder supports 8,
16, 24 and 32-bit operations depending upon the byte width
specified. For each data type the priority encoder generates
the appropriate binary weighted code. For example, when a
byte width of two is specified (I8-I7 = 10), the output of the
encoder is zero when bit 15 is HIGH. However, if a byte width of
four is specified (I8-I7 = 00), the output of the encoder is 16
(decimal) if bit 15 is HIGH and bits 31-16 are LOW. Table 4
shows the output for each data type. If none of the inputs are
HIGH or the most significant bit of the data type specified is
HIGH, then the output is zero. The difference between these
two cases is indicated by the Z-flag of the status register, which
is HIGH only if all inputs are zero.
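
The behavior just described amounts to counting leading zeros
within the selected width. The following sketch illustrates it;
the function name is hypothetical and the return convention (code,
zero flag) is chosen for illustration.

```python
def prioritize(value: int, width_bytes: int = 4):
    """Return (encoder_output, z_flag) for a 1- to 4-byte operand.

    The output is the distance of the most significant ONE from the
    top bit of the selected width. An all-zero input gives output 0
    with the Z flag set.
    """
    if width_bytes not in (1, 2, 3, 4):
        raise ValueError("width_bytes must be 1, 2, 3 or 4")
    bits = 8 * width_bytes
    value &= (1 << bits) - 1      # bits above the width do not participate
    if value == 0:
        return 0, True
    return bits - value.bit_length(), False


if __name__ == "__main__":
    print(prioritize(0x00008000, 2))  # bit 15 HIGH, 16-bit operand
    print(prioritize(0x00008000, 4))  # bit 15 HIGH, 32-bit operand
```

For a floating-point mantissa, the output is the upshift count
needed to normalize the operand.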

Q-Register 

The Q-register holds dividend and quotient bits for division,
and multiplier and product bits for multiplication. During
division, the contents of the Q-register are shifted left, a bit at
a time, with quotient bits inserted into bit 0. During
multiplication, the contents of the Q-register are shifted right, two bits at
a time, with product bits inserted into the most-significant two
bits (according to the selected byte width). The Q-register may
be loaded from the A or B inputs and read onto the Y bus.

Master-Slave Comparator

All ALU outputs (except MSERR) employ three-state buffers. 
The master-slave comparator compares the input and output 
of each buffer. Any difference causes the MSERR signal to be 
made true. In Slave mode, all output buffers are disabled. 
Outputs from a second ALU may then be connected to the 
equivalent pins of the first. The comparator in the slave will 
then detect any difference in the results generated by the two. 
When the Y bus is three-stated by making Output-Enable 
false, the Y bus master-slave comparators are disabled. 

Parity Logic 

For each byte of the DA and DB inputs there is an associated
parity bit (8 in all). If a parity error is detected on any byte, the
Parity-Error signal is made true. Four parity signals (one per
byte) are also generated for the Y bus outputs. EVEN parity is
employed for the Am29C332.
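
Byte-wise even parity can be illustrated as follows. The sketch
assumes that parity bit 0 belongs to the least significant byte;
the text says only that there is one parity bit per byte. Function
names are hypothetical.

```python
def even_parity_bits(word: int) -> int:
    """Return four parity bits for a 32-bit word, one per byte.

    With even parity, each parity bit is chosen so that the byte and
    its parity bit together contain an even number of ONEs.
    Bit 0 of the result is assumed to cover the least significant byte.
    """
    bits = 0
    for byte in range(4):
        data = (word >> (8 * byte)) & 0xFF
        parity = bin(data).count("1") & 1  # 1 when the byte has an odd count
        bits |= parity << byte
    return bits


def parity_error(da: int, pa: int, db: int, pb: int) -> bool:
    """True when any byte of DA or DB disagrees with its parity input."""
    return even_parity_bits(da) != (pa & 0xF) or even_parity_bits(db) != (pb & 0xF)
```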

Status Register 

All necessary information about operations performed in the 
ALU is stored in the 32-bit wide status register after every 
microcycle. Since the register can be saved, an interrupt can 
occur after any cycle. The status register can be loaded from 
either the A or B input of the chip and can be read out on the Y 
bus for saving in an external register file. For loading, the byte 
width indicates how many bytes are to be updated. The status 
register is only updated if the HOLD input is inactive. 

Each byte of the status register holds different types of
information (see Figure 3). The least significant byte (bits 0 to
7) holds eight position bits (PR0-PR7) for the data shifter.
The two most significant bits are not used. The next most
significant byte (bits 8 to 15) holds the 5-bit width field
(WR0-WR4) for the mask generator. The three most-significant
bits of that byte (bits 13 to 15) are read-only bits that
represent three different conditions extracted from the other
bits of the status register. They are C + Z, N ⊕ V, and
(N ⊕ V) + Z for bits 13, 14 and 15 respectively. These bits can be
read on the Y0 pin by the extract-status instruction. The next
byte contains all the necessary information generated by an
ALU operation. The least-significant four bits (bits 16 to 19)
hold carry, negative, overflow and zero flags. Bit 20 holds link
information for single bit shifts and bits 21 and 22 are used by
the multiply and divide instructions. The M flag holds the
multiplier bit for the modified Booth algorithm or it holds the
sign comparison result for the divide algorithm. The S flag
holds the sign of the partial remainder for unsigned division.
Both the flags (M and S) are provided as a part of the status
register so that multiply and divide instructions can be
interrupted at microinstruction boundaries. The most significant
byte of the status register holds nibble carries for BCD
arithmetic. Since BCD arithmetic is performed in two cycles,
the nibble carries are saved in the first cycle and used in the
second cycle. Since all the information is stored, BCD
instructions are also interruptible at the microinstruction boundary.

TABLE 4.

                     Highest Priority Active Bit
Encoder    I8-I7 = 00    I8-I7 = 01    I8-I7 = 10    I8-I7 = 11
Output     (32-bit)      (8-bit)       (16-bit)      (24-bit)
0          None          None          None          None
0          31            7             15            23
1          30            6             14            22
2          29            5             13            21
3          28            4             12            20
.          .             .             .             .

The pattern continues down to bit 0 of each data type, for which
the encoder output is 31 (32-bit), 7 (8-bit), 15 (16-bit) or
23 (24-bit).

Status 0-7:    Position Register

   7     6     5     4     3     2     1     0
   PR7   PR6   PR5   PR4   PR3   PR2   PR1   PR0

Status 8-12:   Width Register
Status 13:     C + Z            (Read Only)
Status 14:     N ⊕ V            (Read Only)
Status 15:     (N ⊕ V) + Z      (Read Only)

   15          14          13            12    11    10    9     8
   SIGNED LE   SIGNED LT   UNSIGNED LE   WR4   WR3   WR2   WR1   WR0

Status 16:     Carry
Status 17:     Negative
Status 18:     Overflow
Status 19:     Zero
Status 20:     Link
Status 21:     Multiply (and divide) Bit
Status 22:     Sign Flag
Status 23:

   23    22    21    20    19    18    17    16
         S     M     L     Z     V     N     C

Status 24-31:  Nibble Carries

   31    30    29    28    27    26    25    24
   NC7   NC6   NC5   NC4   NC3   NC2   NC1   NC0

Note: Overflow is defined as follows:

V = (carry in to MSB) ⊕ (carry out of MSB)

Figure 3. ALU Status Register Bit Assignment
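
The bit assignment of Figure 3 can be expressed as a small decoder
for a saved status word. This is an illustrative sketch; the
function name and dictionary keys are hypothetical. The three
read-only condition bits are recomputed here from C, N, V and Z
using the expressions given in the figure instead of being read
from bits 13 to 15.

```python
def decode_status(word: int) -> dict:
    """Split a 32-bit status word into the fields shown in Figure 3."""
    c = (word >> 16) & 1
    n = (word >> 17) & 1
    v = (word >> 18) & 1
    z = (word >> 19) & 1
    return {
        "position": word & 0xFF,            # PR0-PR7 (bits 0-7)
        "width": (word >> 8) & 0x1F,        # WR0-WR4 (bits 8-12)
        "unsigned_le": c | z,               # bit 13: C + Z
        "signed_lt": n ^ v,                 # bit 14: N xor V
        "signed_le": (n ^ v) | z,           # bit 15: (N xor V) + Z
        "carry": c,
        "negative": n,
        "overflow": v,
        "zero": z,
        "link": (word >> 20) & 1,
        "m": (word >> 21) & 1,              # multiply (and divide) bit
        "s": (word >> 22) & 1,              # sign flag
        "nibble_carries": (word >> 24) & 0xFF,  # NC0-NC7
    }
```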
Am29C332 INSTRUCTION SET
Data Types 

The Am29C332 supports the following data types: 

1. Integer 

2. Binary-coded decimal 

3. Variable-length bit field 

The first two data types fall into the category of byte boundary
aligned operands (Figure 4). The size of the operand could be
1 byte, 2 bytes, 3 bytes or 4 bytes. All operands are least
significant bit (bit 0) aligned. The byte width is determined by
bits I8 and I7 of the instruction as shown in Table 5.

TABLE 5.

I8   I7   Width in Bytes
0    0          4
0    1          1
1    0          2
1    1          3

The third data type has operands of variable width (1 to 32
bits) as shown in Figure 4. The operand is specified by width
inputs (W0-W4) and position inputs (P0-P5). The position
inputs indicate the least significant bit position of the operand.
Depending on bits I8 and I7 of the instruction, the width and
position inputs can be selected from either the Status Register
or the Width and Position Pins as shown in Table 6. A
summary of the data types available is illustrated in Table 7.

TABLE 6.

             Position        Width
I8   I7    Pins   Reg     Pins   Reg
0    0      X              X
0    1      X                     X
1    0             X       X
1    1             X              X
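
Tables 5 and 6 give the same two instruction bits different
meanings depending on the instruction format. A compact
illustration follows; the function names are hypothetical.

```python
def byte_width(i8: int, i7: int) -> int:
    """Table 5: operand width in bytes for byte aligned instructions."""
    code = ((i8 & 1) << 1) | (i7 & 1)
    return 4 if code == 0 else code


def field_sources(i8: int, i7: int) -> dict:
    """Table 6: where field instructions take position and width from."""
    return {
        "position": "register" if i8 & 1 else "pins",
        "width": "register" if i7 & 1 else "pins",
    }
```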



TABLE 7.

Data Type   Size                       Range
Integer                                Signed                 Unsigned
            1 byte    8 bits           -128 to +127           0 to 255
            2 bytes   16 bits          -2^15 to +2^15 - 1     0 to 2^16 - 1
            3 bytes   24 bits          -2^23 to +2^23 - 1     0 to 2^24 - 1
            4 bytes   32 bits          -2^31 to +2^31 - 1     0 to 2^32 - 1
BCD         1 to 4 bytes (8 digits)    Numeric, 2 digits per byte.
                                       Most-significant digit may be
                                       used for sign.
Variable    1 to 32 bits               Dependent on position and
                                       width inputs.



Instruction Format 

The Am29C332 has two types of Instruction Formats: 
1. Byte Boundary Aligned Instructions (FORMAT 1): 



[Instruction format diagram not reproduced]

2. Variable-Length Bit Field Instructions (FORMAT 2):

[Instruction format diagram not reproduced; the position field is labeled P/PR]

Figure 4. Data Types
  Byte Boundary Aligned Operands
  Variable-Length Bit Field
    p = Bit displacement of the least significant bit of the field with respect to bit 0.
    w = Width of bit field.

For instructions that allow a field to be shifted up or down,
P0-P5 is a two's-complement number in the range -32 to
+31 representing the direction and magnitude of the shift. For
instructions that assume a fixed field position, P0-P4 represent
the position of the least-significant bit of the field and P5
is ignored.
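
The two interpretations of the position field can be sketched as follows. This is a minimal illustration in Python, not device code; the function names `decode_shift_position` and `decode_fixed_position` are hypothetical, and the only behavior assumed is what the paragraph above states (6-bit two's complement for shifts, P5 ignored for fixed positions).

```python
def decode_shift_position(p_bits: int) -> int:
    """Interpret P0-P5 as a 6-bit two's-complement number (-32..+31).

    A positive result means an upshift, a negative result a downshift.
    """
    p_bits &= 0x3F                      # keep only P0-P5
    return p_bits - 64 if p_bits & 0x20 else p_bits


def decode_fixed_position(p_bits: int) -> int:
    """Interpret P0-P4 as a bit position (0..31); P5 is ignored."""
    return p_bits & 0x1F
```

For example, the field value 0b111111 decodes to a downshift by one bit, while the same pins read by a fixed-position instruction select bit 31.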

Instruction Classification

ALU instructions can be classified as follows: 
A. Byte Boundary Aligned Operand Instructions:

1. Arithmetic 

- Binary, BCD 

- Multiply steps 

- Division steps (single and multiple precision)

2. Prioritize 

3. Logical 

4. Single-bit shifts 

5. Data movement 

B. Variable-Length Bit Field Operand Instructions: 

1. N-bit shifts and rotates 

2. Bit manipulations 

3. Field logical operations (aligned, non-aligned, extract) 

4. Mask generation 

Three-fourths of the ALU instructions apply to operands that
are byte boundary aligned. For these instructions, two orthogonal
issues are the width of the operand (in bytes) and the
contents of the high order unselected bytes on the Y bus. As
mentioned earlier, the width of the operand is specified by I8
and I7. With the exception of a few instructions, the unselected
bytes are assigned values as follows: for single operand
instructions, unselected bytes are passed unchanged from the
source (A or B). For two operand instructions, unselected
bytes are passed unchanged from the destination (B input).

In the last quarter of the instruction set, the width of the
operand is from 1 to 32 bits (based on the width input) for field
operations, 32 bits for N-bit shift operations and 1 bit for
bit-oriented operations. In the case of field-aligned and single-bit
operands, the position bits (P0-P4) determine the least
significant bit of the operand. In the case of N-bit shifts and
field non-aligned operands, the position bits P0-P5 form a 6-bit
signed integer determining the magnitude and direction of the
shift.

Flags 

Byte-Aligned Instructions 

The zero flag always looks only at the selected bytes:
Z <- (Y and bytemask (byte width) = 0)

Similarly, N <- sign-bit (Y, byte width), where the function
"sign-bit" returns bit 7, 15, 23, or 31 of the first argument for
byte widths 01, 10, 11, or 00 respectively.
Also, C <- carry (byte width) returns the carry from the
appropriate byte boundary, and:
V <- overflow (byte width) = (carry into MSB) XOR (carry
out of MSB)
returns the overflow from the appropriate byte boundary.
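
As a rough illustration of how the four flag definitions fit together, the sketch below models an addition restricted to the selected bytes. It is a behavioral approximation written for this text, not a description of the device's internal logic; `add_with_flags` and `WIDTH_BYTES` are hypothetical names, and the width encoding is taken from Table 5.

```python
# Byte-width encoding from Table 5: (I8, I7) -> operand width in bytes.
WIDTH_BYTES = {0b00: 4, 0b01: 1, 0b10: 2, 0b11: 3}


def add_with_flags(a: int, b: int, width_code: int, carry_in: int = 0):
    """Add the selected bytes of a and b; return (result, flags)."""
    nbits = 8 * WIDTH_BYTES[width_code]
    mask = (1 << nbits) - 1             # bytemask(byte width)
    a_sel, b_sel = a & mask, b & mask

    total = a_sel + b_sel + carry_in
    y_sel = total & mask

    carry_out = (total >> nbits) & 1    # carry from the byte boundary
    low = mask >> 1                     # all selected bits below the MSB
    carry_into_msb = ((a_sel & low) + (b_sel & low) + carry_in) >> (nbits - 1)

    flags = {
        "Z": int(y_sel == 0),                   # only selected bytes are tested
        "N": (y_sel >> (nbits - 1)) & 1,        # bit 7, 15, 23, or 31
        "C": carry_out,
        "V": carry_into_msb ^ carry_out,        # carry in XOR carry out of MSB
    }
    return y_sel, flags
```

With a one-byte width, adding 0x7F and 0x01 gives 0x80 with N and V set, which is the expected signed overflow at the byte boundary.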

The link (L) flag is generally loaded with the bit moved out of 
the highest selected byte in the case of upshifts, or the bit 
moved out of the least significant byte for downshifts. Figure 5 
shows the shift operation using link bit. Other status flags have 
specialized uses, explained in the following sections. 

[Diagram: Shift Down and Shift Up paths over 1, 2, 3, or 4 bytes of A (or B), with a multiplexer selecting the fill bit and the link bit L receiving the bit shifted out.]

Shift with sign bit fill implements arithmetic shift.

Figure 5. Upshift/Downshift Using Link Bit

Variable-Length Field Instruction: 

Generally, only N and Z are affected. N takes the most-significant
bit of the 32-bit result (i.e., N <- Y31). Z detects
zeros in the selected field of the result (i.e., Z <- (Y and
bitmask (position, width) = 0)).
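
The field flags can be sketched in a few lines. The helper name `bitmask` follows the text; its exact definition is an assumption here (a run of `width` ones starting at `position`, truncated at bit 31), and `field_flags` is a hypothetical wrapper.

```python
def bitmask(position: int, width: int) -> int:
    """Ones in bits position .. position+width-1, clipped to a 32-bit word."""
    high = min(position + width, 32)
    return ((1 << high) - 1) & ~((1 << position) - 1)


def field_flags(y: int, position: int, width: int) -> dict:
    """N comes from Y31 of the whole word; Z looks only at the selected field."""
    return {
        "Z": int((y & bitmask(position, width)) == 0),
        "N": (y >> 31) & 1,
    }
```

Note that N and Z can both be set at once: N reflects bit 31 of the full result, while Z only reports on the field.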

Output Select 

The Register Status pin, RS, may be used to switch the C, Z,
N, V, and L output pins between the direct output of the ALU
and the outputs of the corresponding bits in the status register.
If the direct status output is selected, then for instructions that
do not affect a particular flag (e.g., carry for logical arithmetic)
that output will reflect the state of its corresponding bit in the
status register. Similarly, when the HOLD signal is made
HIGH, the C, Z, N, V and L pins will be made equal to the
contents of the status register, regardless of the RS input.

INSTRUCTION SET SUMMARY

Operand Size: Variable Byte Width: 1, 2, 3, 4 Bytes

Type               Operation                                      Data Type
Arithmetic         • Increment by one, two, four                  Binary Integer
                   • Decrement by one, two, four                  and BCD
                   • Add, addc (carry = macro/micro)
                   • Sub, subr
                   • Subc, subrc (carry/borrow)
                   • BCD sum and difference correct steps
                   • Negate (two's complement)                    Binary Integer
                   • Multiply steps (modified Booth)
                   • Divide steps (non-restoring)
                     (Signed and unsigned)
Prioritize         • Prioritize                                   Binary
Logical            • Not, OR, AND, XOR, XNOR, zero, sign          Binary
Single-Bit         • Upshift with 0, 1, link fill                 Binary
Shifts             • Downshift with 0, 1, link, sign
                     (Single and double precision)
Data               • Zero extend                                  Binary
Movement           • Sign extend
                   • Pass-status, Q-Reg
                   • Load-status, Q-Reg
                   • Merge

Operand Size: 32 Bits

Type               Operation                                      Data Type
N-Bit Shifts       • Upshift by 0 to 31 bits with 0 fill          Binary
N-Bit Rotates      • Downshift by 1 to 32 bits with 0, sign fill
                   • Rotate by 0 to 31 bits

Operand Size: Single Bit

Type               Operation                                      Data Type
Bit                • Extract                                      Binary
Manipulation       • Set
                   • Reset

Operand Size: Variable Length Bitfield: 1 to 32 Bits

Type               Operation                                      Data Type
Field Logical      • Not, OR, XOR, AND, extract, insert           Binary
(aligned and
non-aligned)
Mask               • Pass-mask                                    Binary

INSTRUCTION SET GLOSSARY
(Sorted by Opcode in Hex Notation)

Opcode  Name         Opcode  Name         Opcode  Name           Opcode  Name
00      ZERO-EXTA    20      DN1-0F-A     40      AND            60      NB-SN-SHA
01      ZERO-EXTB    21      DN1-0F-B     41      XNOR           61      NB-SN-SHB
02      SIGN-EXTA    22      DN1-0F-AQ    42      ADD            62      NB-0F-SHA
03      SIGN-EXTB    23      DN1-0F-BQ    43      ADDC           63      NB-0F-SHB
04      PASS-STAT    24      DN1-1F-A     44      SUB            64      NBROT-A
05      PASS-Q       25      DN1-1F-B     45      SUBC           65      NBROT-B
06      LOADQ-A      26      DN1-1F-AQ    46      SUBR           66      EXTBIT-A
07      LOADQ-B      27      DN1-1F-BQ    47      SUBRC          67      EXTBIT-B
08      NOT-A        28      DN1-LF-A     48      SUM-CORR-A     68      SETBIT-A
09      NOT-B        29      DN1-LF-B     49      SUM-CORR-B     69      SETBIT-B
0A      NEG-A        2A      DN1-LF-AQ    4A      DIFF-CORR-A    6A      RSTBIT-A
0B      NEG-B        2B      DN1-LF-BQ    4B      DIFF-CORR-B    6B      RSTBIT-B
0C      PRIOR-A      2C      DN1-AR-A     4C      -              6C      SETBIT-STAT
0D      PRIOR-B      2D      DN1-AR-B     4D      -              6D      RSTBIT-STAT
0E      MERGEA-B     2E      DN1-AR-AQ    4E      SDIVFIRST      6E      NOTF-AL-B
0F      MERGEB-A     2F      DN1-AR-BQ    4F      UDIVFIRST      6F      PASSF-AL-B
10      DECR-A       30      UP1-0F-A     50      SDIVSTEP       70      NOTF-A
11      DECR-B       31      UP1-0F-B     51      SDIVLAST1      71      NOTF-AL-A
12      INCR-A       32      UP1-0F-AQ    52      MPDIVSTEP1     72      PASSF-A
13      INCR-B       33      UP1-0F-BQ    53      MPSDIVSTEP3    73      PASSF-AL-A
14      DECR2-A      34      UP1-1F-A     54      UDIVSTEP       74      ORF-A
15      DECR2-B      35      UP1-1F-B     55      UDIVLAST       75      ORF-AL-A
16      INCR2-A      36      UP1-1F-AQ    56      MPDIVSTEP2     76      XORF-A
17      INCR2-B      37      UP1-1F-BQ    57      MPUDIVSTP3     77      XORF-AL-A
18      DECR4-A      38      UP1-LF-A     58      REMCORR        78      ANDF-A
19      DECR4-B      39      UP1-LF-B     59      QUOCORR        79      ANDF-AL-A
1A      INCR4-A      3A      UP1-LF-AQ    5A      SDIVLAST2      7A      EXTF-A
1B      INCR4-B      3B      UP1-LF-BQ    5B      UMULFIRST      7B      EXTF-B
1C      LDSTAT-A     3C      ZERO         5C      UMULSTEP       7C      EXTF-AB
1D      LDSTAT-B     3D      SIGN         5D      UMULLAST       7D      EXTF-BA
1E      -            3E      OR           5E      SMULSTEP       7E      EXTBIT-STAT
1F      -            3F      XOR          5F      SMULFIRST      7F      PASS-MASK

TABLE 6-1. DATA MOVEMENT INSTRUCTIONS

                                        Y Output              Status
Mnemonics   Code  Description           Unsel   Sel           S M L Z V N C
ZERO-EXTA   00    Zero Extend           0       A
ZERO-EXTB   01                          0       B
SIGN-EXTA   02    Sign Extend           Sign    A
SIGN-EXTB   03                          Sign    B
MERGEA-B    0E    Merge A with B        B       A Merge B
MERGEB-A    0F    Merge B with A        A       B Merge A

TABLE 6-2. DATA MOVEMENT INSTRUCTIONS

                                        Y Output      Status      Status
Mnemonics   Code  Description           Unsel   Sel   Register    S M L Z V N C
PASS-STAT   04    Pass Status Register  B       S
LDSTAT-A    1C    Load Status Register  S       A     A           + + + + + + +
LDSTAT-B    1D                          S       B     B           + + + + + + +

TABLE 6-3. DATA MOVEMENT INSTRUCTIONS

                                        Y Output      Q           Status
Mnemonics   Code  Description           Unsel   Sel   Register    S M L Z V N C
PASS-Q      05    Pass Q Register       B       Q
LOADQ-A     06    Load Q                Q       A     A                 *   *
LOADQ-B     07                          Q       B     B                 *   *

Legend: Unsel = Unselected Byte(s)
Sel = Selected Byte(s)
A = A Input
B = B Input
Q = Q Register
+ = Updated only if byte width is 3 or 4
* = Updated

Examples:

2, ZERO-EXTB   Pass lower two bytes of B to Y with zero fill on upper two bytes

0, LOADQ-A     Load all four bytes of A into Q Register, pass updated Q Register to Y

TABLE 7. LOGICAL INSTRUCTIONS

                                      Y Output                        Status
Mnemonics  Code  Description          Unsel  Sel                      S M L Z V N C
NOT-A      08    One's Complement     A      NOT A                          *   *
NOT-B      09                         B      NOT B                          *   *
ZERO       3C    Pass Zero            B      0                              1   0
SIGN       3D    Pass Sign            B      0 (N = 0); -1 (N = 1)          Z = NOT N
OR         3E    OR                   B      A OR B                         *   *
XOR        3F    EXOR                 B      A XOR B                        *   *
AND        40    AND                  B      A AND B                        *   *
XNOR       41    XNOR                 B      A XNOR B                       *   *

Note: 1. These instructions use the byte aligned instruction format (FORMAT 1).

Legend: Unsel = Unselected Byte(s)
Sel = Selected Byte(s)
A = A Input
B = B Input
Q = Q Register
* = Updated

Examples:

2, NOT-A   Complement low order two bytes of A and output to Y with
           high order two bytes of A uncomplemented.

1, AND     AND first byte of A and B. Output to Y with high three
           bytes of B.
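
The two examples for the logical instructions can be checked with a small behavioral model. This is only a sketch of the selected/unselected byte rule as described in this section (single-operand instructions pass unselected bytes from the source, two-operand instructions pass them from B); `not_a`, `and_ab`, and `byte_mask` are hypothetical names, and the width encoding is taken from Table 5.

```python
WORD = 0xFFFFFFFF
WIDTH_BYTES = {0b00: 4, 0b01: 1, 0b10: 2, 0b11: 3}   # (I8, I7) -> bytes


def byte_mask(width_code: int) -> int:
    """Mask covering the selected (low-order) bytes."""
    return (1 << (8 * WIDTH_BYTES[width_code])) - 1


def not_a(a: int, width_code: int) -> int:
    """NOT-A: complement selected bytes, pass unselected bytes of A."""
    m = byte_mask(width_code)
    return ((~a) & m) | (a & ~m & WORD)


def and_ab(a: int, b: int, width_code: int) -> int:
    """AND: selected bytes are A AND B, unselected bytes come from B."""
    m = byte_mask(width_code)
    return ((a & b) & m) | (b & ~m & WORD)
```

For instance, `not_a(0x12345678, 0b10)` leaves the upper half 0x1234 untouched and complements only 0x5678.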

TABLE 8-1. SINGLE-BIT SHIFT INSTRUCTIONS (SINGLE PRECISION) 



Mnemonics


Code 


Description 


Y Output 


Status 


Unsei 


Sel 


S 


M 


L 


Z 


V 


N 


C 


DN1-0F-A 


20 


Downshift, Zero Fill 


A 


Yi = Ai + i, Ymsb = 












* 




DN1-0F-B 


21 


B 


Vi = Bi + i, Ymsb = 












• 




DN1-1F-A 


24 


Downshift, One Fill 


A 


Yi = Ai + i, Ymsb = 1 












* 




DN1-1F-B 


25 


B 


Yi = Bi + i, Ymsb = 1 












• 




DN1-LF-A 


28 


Downshift, Link Fill


A


Yi = Ai+1, Ymsb = L












• 




DN1-LF-B 


29 


B 


Yi = Bi + i, Ymsb = L 












* 




DN1-AR-A 


2C 


Downshift, Sign Fill 


A 


Yi = Aj + i, Ymsb = N 












* 




DN1-AR-B 


2D 


B 


Yi = Bi + i, Ymsb = N 












* 




UP1-0F-A 


30 


Upshift, Zero Fill 


A 


Yi = Ai-i, Yo = 












• 




UP1-0F-B 


31 


B 


Yi = Bi.i, Yo = 












• 




UP1-1F-A 


34 


Upshift, One Fill 


A 


Yi = Ai_i, Yo=1 












* 




UP1-1F-B 


35 


B 


Yi = Bi.i, Yo = 1 












* 




UP1-LF-A 


38 


Upshift, Link Fill


A 


Yi = Ai.i, Yo = L 












* 




UP1-LF-B 


39 


B 


Yi = Bi_i, Yo = L 












* 





Note: 1. These instructions use the byte aligned instruction format (FORMAT 1).

Example:

2, UP1-1F-A   Shift lower two bytes of A up one bit. Set LSB to 1. Fill
              unselected bytes with upper two bytes of A.



TABLE 8-2. SINGLE-BIT SHIFT INSTRUCTIONS (DOUBLE PRECISION)

                                        Y Output & Q Register          Status
Mnemonics   Code  Description           Selected Bytes         Note    S M L Z V N C
DN1-0F-AQ   22    Downshift, Zero Fill  0 -> A -> Q            2)
DN1-0F-BQ   23                          0 -> B -> Q            3)
DN1-1F-AQ   26    Downshift, One Fill   1 -> A -> Q            2)
DN1-1F-BQ   27                          1 -> B -> Q            3)
DN1-LF-AQ   2A    Downshift, Link Fill  L -> A -> Q            2)
DN1-LF-BQ   2B                          L -> B -> Q            3)
DN1-AR-AQ   2E    Downshift, Sign Fill  N -> A -> Q            2)
DN1-AR-BQ   2F                          N -> B -> Q            3)
UP1-0F-AQ   32    Upshift, Zero Fill    A <- Q <- 0            2)
UP1-0F-BQ   33                          B <- Q <- 0            3)
UP1-1F-AQ   36    Upshift, One Fill     A <- Q <- 1            2)
UP1-1F-BQ   37                          B <- Q <- 1            3)
UP1-LF-AQ   3A    Upshift, Link Fill    A <- Q <- L            2)
UP1-LF-BQ   3B                          B <- Q <- L            3)

Notes: 1. These instructions use the byte aligned instruction format (FORMAT 1).

2. Y Unselected byte from A, Q Unselected byte unchanged.

3. Y Unselected byte from B, Q Unselected byte unchanged.

Legend: Unsel = Unselected Byte(s)
Sel = Selected Byte(s)
A = A Input
B = B Input
Q = Q Register
* = Updated

Examples:

0, DN1-AR-BQ   Shift 64 bits (all 32 bits of both B and Q)
               down by one bit. LSB of B fills MSB of Q.
               MSB of B set to sign bit (bit N of status register).

               sign bit -> B (32 bits) -> Q (32 bits)

3, UP1-LF-AQ   Shift 48 bits (24 bits of A and 24 bits of Q)
               up by one bit. MSB of 24-bit Q fills LSB of A.
               MSB of 24-bit A sets link status bit. LSB of
               Q is filled with original link value.

               link status bit <- A (24 bits) <- Q (24 bits) <- link status bit
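
The two double-precision examples can be modeled roughly as shown below. The sketch covers only the selected bytes (unselected bytes of Y and Q are left out), and the function names `dn1_ar_bq` and `up1_lf_aq` are hypothetical Python stand-ins for the mnemonics, written from the textual description of the examples rather than from device logic.

```python
def dn1_ar_bq(b: int, q: int, n_flag: int):
    """64-bit downshift by one: N -> B -> Q (byte width 4).

    LSB of B fills MSB of Q; MSB of B is filled with the N status bit.
    Returns (new_b, new_q).
    """
    new_q = ((q >> 1) | ((b & 1) << 31)) & 0xFFFFFFFF
    new_b = ((b >> 1) | ((n_flag & 1) << 31)) & 0xFFFFFFFF
    return new_b, new_q


def up1_lf_aq(a: int, q: int, link: int, nbits: int = 24):
    """Upshift of A:Q by one over nbits-wide halves: A <- Q <- L.

    MSB of Q fills LSB of A, MSB of A goes to the link bit, and the
    original link value fills LSB of Q. Returns (new_a, new_q, new_link).
    """
    mask = (1 << nbits) - 1
    a_sel, q_sel = a & mask, q & mask
    new_link = (a_sel >> (nbits - 1)) & 1
    new_a = ((a_sel << 1) | (q_sel >> (nbits - 1))) & mask
    new_q = ((q_sel << 1) | (link & 1)) & mask
    return new_a, new_q, new_link
```

Chaining the link bit this way is what lets a sequence of single-bit shifts operate on operands wider than one register pair.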

TABLE 9. PRIORITIZE INSTRUCTIONS

                                                               Status
Mnemonics  Code  Description      Y Output                     S M L Z V N C
PRIOR-A    0C    Prioritization   Location of Highest 1 Bit          *
PRIOR-B    0D                                                        *

Notes: 1. These instructions use the byte aligned instruction format (FORMAT 1).

2. Priority also loaded into STATUS <7:0>.

3. Refer to Table 4.

Legend: A = A Input
B = B Input
Q = Q Register
* = Updated

Example:

3, PRIOR-A   Assume A is [leading byte illegible] 00000000 00000000
             Value placed on Y is 2




TABLE 10-1. ARITHMETIC INSTRUCTIONS 



Example: 



Mnemonics 


Code 


Description 


Y Output 


Status 


Unsel 


Sel 


S 


M 


L 


Z 


V 


N 


C 


NEG-A 


0A


Two's Complement 


A 


NOT A + 1










* 


• 


• 


NEG-B 


0B


B 


NOT B + 1










* 


* 


* 


INCR-A 


12 


Increment by One 


A 


A+1 










* 


• 


• 


INCR-B 


13 


B 


B + 1 










* 


* 


* 


INCR2-A 


16 


Increment by Two 


A 


A + 2 










* 


* 


• 


INCR2-B 


17 


B 


B + 2 










* 


* 


* 


INCR4-A 


1A 


Increment by Four 


A 


A + 4 










* 


* 


* 


INCR4-B


1B


B


B + 4 










* 


* 


* 


DECR-A 


10 


Decrement by One 


A 


A-1 










* 


* 


* 


DECR-B 


11 


B 


B-1 










* 


* 


* 


DECR2-A 


14 


Decrement by Two 


A 


A-2 










* 


* 


• 


DECR2-B 


15 


B 


B-2 










• 


* 


* 


DECR4-A 


18 


Decrement by Four 


A 


A-4 










* 


• 


* 


DECR4-B 


19 


B 


B-4 










* 


* 


* 



Notes: 1. These instructions use the byte aligned instruction format (FORMAT 1).

2. Borrow, rather than carry, is generated if BOROW is HIGH (borrow = NOT carry).

3. Nibble bits are set by these instructions. NEG-A (or NEG-B) and DIFF-CORR may be used to
form 10's complement of a BCD number. Use SUM-CORR (for increment) or DIFF-CORR (for
decrement) to increment or decrement a BCD number.

Legend: Unsel = Unselected Byte(s)
Sel = Selected Byte(s)
A = A Input
B = B Input
Q = Q Register
* = Updated

Example:

DECR4-A   Decrement lower two bytes of A by 4

TABLE 10-2. ARITHMETIC INSTRUCTIONS



Mnemonics 


Code 


Description 


Y Output 


Status 


Unsel 


Sel 


S 


M 


L 


Z 


V 


N 


C 


ADD 


42 


Add 


B 


A + B 
















ADDC 


43 


Add with Carry 


B 


A + B + C 6) 
















SUB 


44 


Subtract 


B 


A + B+1 
















SUBR 


46 


B 


B + A + 1 
















SUBC 


45 


Subtract with Carry 


B 


A + B + 1 + C 2) 6) 
















SUBRC 


47 


B 


B + A + 1 + C 2) 6) 
















SUM-CORR-A 


48 


Correct BCD Nibbles 
for Addition 


A 


Corrected A 3) 
















SUM-CORR-B 


49 


B 


Corrected B 3) 
















DIFF-CORR-A 


4A 


Correct BCD Nibbles 
for Subtraction 


A 


Corrected A 3) 
















DIFF-CORR-B 


4B 


B 


Corrected B 3) 

















Notes: 1. These instructions use the byte aligned instruction format (FORMAT 1).

2. BOROW is LOW. For subtract operations, a borrow rather than a carry is stored in STATUS if BOROW is HIGH.
Carry is always generated for ADD regardless of BOROW.

3. First, the nibble carries NC0-NC7 are tested. Any nibble carry/borrow that is set to 1 generates "6" internally as
a correction word and then the correction word is added (SUM-CORR-) or subtracted (DIFF-CORR-) from the
operand. NC0-NC7 are not affected by this operation.

4. Use SUM-CORR or DIFF-CORR to add or subtract a BCD number.

5. Use ADDC, SUBC, or SUBRC to perform operations on integers longer than 32 bits.

6. Carry bit is obtained from MCin if M/m is HIGH. Otherwise, carry is obtained from the C status bit.

Legend: Unsel = Unselected Byte(s)
Sel = Selected Byte(s)
A = A Input
B = B Input
Q = Q Register
* = Updated only if byte width is 3 or 4

Example:

0, ADD   Add two 32-bit two's-complement integers
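
Note 5 describes the usual carry-chaining pattern for multi-word arithmetic. The sketch below shows the idea for a 64-bit addition built from two 32-bit additions; `add32` and `add64` are hypothetical helper names, and the mapping of the first call to ADD and the second to ADDC is an illustration of the note, not a microcode listing.

```python
M32 = 0xFFFFFFFF


def add32(a: int, b: int, carry_in: int = 0):
    """One 32-bit addition; returns (sum, carry_out)."""
    total = (a & M32) + (b & M32) + (carry_in & 1)
    return total & M32, total >> 32


def add64(a_hi: int, a_lo: int, b_hi: int, b_lo: int):
    """64-bit addition from two 32-bit steps; returns (hi, lo, carry_out)."""
    lo, carry = add32(a_lo, b_lo)            # low words: plain ADD
    hi, carry = add32(a_hi, b_hi, carry)     # high words: ADDC uses the carry
    return hi, lo, carry
```

The same structure extends to any number of words: each further word is added with the carry produced by the previous step.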

TABLE 11-1. DIVIDE INSTRUCTIONS (Aligned Format)


Name 


l6-l0 
Code 


description 


Source for 

Unselected 

Bytes 


Output 


Status 


S 


IM 


L 


Z 


V 


N 


c 


Signed Divide Steps | 


SDIVFIRST 


4 E 


First Instruction for Signed Divide


B 


Y. Q 


• 


• 


• 


• 




* 




SDIVSTEP 


5 0


Iterate Step (#bits - 1 times)


B 


Y, Q 




• 


• 


• 




* 


« 


SDIVLAST1 


5 1 


Last Divide Instruction Unless 


B 


Y, Q 




* 




* 




• 


« 


SDIVLAST2 


5 A 


Dividend & Remainder Negative 


B 


Y 








* 








Unsigned Divide Steps | 


UDIVFIRST


4 F


First Instruction for Unsigned Divide


B 


Y, Q 






* 


* 




• 




UDIVSTEP 


5 4 


Iterate Step (#bits - 1 times) 


B


Y, Q 


* 


* 


* 


* 






• 


UDIVLAST 


5 5 


Last Instruction 


B 


Y, Q 





* 




• 




* 


* 


Multiprecision Divide Steps


MPDIVSTEP1 


5 2 


First Instruction


B 


Y, Q 
















MPDIVSTEP2 


5 6 


Executed Times for Double 


B 


Y, Q 
















MPSDIVSTEP3 


5 3 


Last Instruction of Inner Loop 


B 


Y, Q 
















MPUDIVSTP3 


5 7 


Used for Unsigned Divide 


B 


Y, Q 
















Correction Steps | 


REMCORR


5 8 


Correct Remainder After Divide 


B 


Y 














• 


QUOCORR


5 9 


Correct Quotient After Divide 


B 


Y 










• 




* 


TABLE 11-2. EXAMPLE CODING FORM (Signed Division)

Am29C331                          Am29C332                          Am29C334
OP      Branch  Cond    Multi     B/W  OP         Width  Position   A-IN  B-IN  Y-OUT  OE
                Select  Sel
CONT                              2    LOADQ-A                      R2                 1
CONT                              0    SIGN                                     R3     0
FOR_D   15                        2    SDIVFIRST                    R4    R3    R3     0
DJMP_S                            2    SDIVSTEP                     R4    R3    R3     0
CONT                              2    SDIVLAST1                    R4    R3    R3     0
BRCC_D  DONE    Z                                                                      1
CONT                              2    SDIVLAST2                    R4    R3    R3     0
CONT                              2    PASS-Q                                   R1     0
CONT                              2    QUOCORR                            R1    R1     0
CONT                              2    REMCORR                      R4    R3    R3     0

Note: Divisor in A, Dividend in Q
Quotient in Q, Remainder in B

Legend: A = A Input
B = B Input
S = Status Register
Q = Q Register

R1 = Quotient
R2 = Dividend
R3 = Remainder
R4 = Divisor
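
For orientation, the general shape of a non-restoring division loop is sketched below for unsigned operands. This is the textbook algorithm, given here only to illustrate why the sequence has a first step, a repeated step, and a final correction; it is not a model of the SDIVFIRST/SDIVSTEP/SDIVLAST instructions themselves, and the function name and the 16-bit default are assumptions made for the example.

```python
def nonrestoring_udiv(dividend: int, divisor: int, nbits: int = 16):
    """Unsigned non-restoring division; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    mask = (1 << nbits) - 1
    quo = dividend & mask       # plays the role of the Q register
    rem = 0                     # partial remainder, may go negative

    for _ in range(nbits):
        msb = (quo >> (nbits - 1)) & 1
        quo = (quo << 1) & mask
        # Shift the remainder:quotient pair up one bit, then subtract the
        # divisor if the previous remainder was non-negative, else add it.
        if rem >= 0:
            rem = 2 * rem + msb - divisor
        else:
            rem = 2 * rem + msb + divisor
        if rem >= 0:
            quo |= 1            # quotient bit is 1 when no restore is needed

    if rem < 0:                 # final remainder correction
        rem += divisor
    return quo, rem
```

The final `if` corresponds in spirit to a remainder-correction step: the loop never restores inside the iteration, so a negative remainder can only be fixed up at the end.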

TABLE 12-1. MULTIPLY INSTRUCTIONS (Aligned Format)


Name 


l6-l0 

Code 


Description


Source for 

Unselected 

Bytes 


Output 


Status 


S 


M 


L 


Z 


V 


N 


c 


Signed Multiply Steps | 


SMULFIRST 


5 F 


First multiply instruction 


B 


y(1) 












" 




SMULSTEP 


5 E 


Iterate step (#bits/2 - 1 steps) 


B 


y(1)
















Unsigned Multiply Steps 


UMULFIRST 


5 B 


First multiply instruction


B


y(1)




• 












UMULSTEP 


5 C 


Iterate step (# bits/2 - 1 steps). 


B 


y(1) 




• 












UMULLAST 


5 D 


Last multiply instruction 


B 


y(1)








• 








TABLE 12-2. EXAMPLE CODING FORM (Unsigned Multiply)

Am29C331                          Am29C332                          Am29C334
OP      Branch  Cond    Multi     B/W  OP         Width  Position   A-IN  B-IN  Y-OUT  OE
                Select  Sel
CONT                              3    ZERO                               R3    R3     0
CONT                              3    LOADQ-A                      R1                 1
FOR_D   11                        3    UMULFIRST                    R2    R3    R3     0
DJMP_S                            3    UMULSTEP                     R2    R3    R3     0
CONT                              3    UMULLAST                     R2    R3    R3     0
CONT                              3    PASS-Q                                   R4     0

Notes: 1. Put ALU output in B.

2. Multiplicand in A, Multiplier in Q
Product (HIGH) in B, Product (LOW) in Q

Legend: A = A Input
B = B Input
S = Status Register
Q = Q Register
R1 = Multiplier
R2 = Multiplicand
R3 = Product (HIGH)
R4 = Product (LOW)

TABLE 13. SHIFT/ROTATE INSTRUCTIONS



Mnemonics 


Code 


Description 


Y Output 


Status 


S 


M 


L 


Z 


V 


N 


c 


NB-OF-SHA 


62 


Field Shift, Zero Fill 


Yi + p = Ai, 2) 












« 




NB-OF-SHB 


63 


Yi + p = Bi. 2) 












* 




NB-SN-SHA 


60 


Field Shift, Sign Fill 


Yi + p-Ai, N 2) 












* 




NB-SN-SHB 


61 


Yi + p = Bi, N 2) 












* 




NBROT-A 


64 


Field Rotate 


Yi = A(i_p)mod32 3) 












* 




NBROT-B 


65 


Yi = B(i.p)nt,od32 3) 












* 





Notes: 1. These instructions use the field instruction format (FORMAT 2).

2. "p" stands for bit displacement from P0-P5 or from PR0-PR5 (-32 <= p <= 31).
If p is positive, Yp-1 to Y0 are equal to the fill bit.
If p is negative, Y31 to Y31+p+1 are equal to the fill bit.

3. The sign of the position input is ignored for this instruction and P0-P4 are treated as a positive magnitude for a
circular upshift.

Legend: A = A Input
B = B Input
Q = Q Register
* = Updated

Examples: *

NB-0F-SHA,,4     Shift A up 4 bits and zero fill
NB-0F-SHB,,-17   Shift B down 17 bits and sign fill

*Width field not used
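
Notes 2 and 3 translate fairly directly into code. The following sketch assumes a 32-bit word and takes the fill bit as an explicit argument (0 for zero fill, the N status bit for sign fill); `nbit_shift` and `nbit_rotate` are hypothetical names introduced for this illustration.

```python
M32 = 0xFFFFFFFF


def nbit_shift(x: int, p: int, fill: int = 0) -> int:
    """Shift x by p bits (-32 <= p <= 31); vacated bits take the fill bit.

    Positive p shifts up (Yp-1..Y0 are filled); negative p shifts down
    (Y31..Y31+p+1 are filled).
    """
    x &= M32
    if p >= 0:
        fill_bits = ((1 << p) - 1) if fill else 0
        return ((x << p) & M32) | fill_bits
    n = -p
    fill_bits = (((1 << n) - 1) << (32 - n)) if fill else 0
    return (x >> n) | fill_bits


def nbit_rotate(x: int, p: int) -> int:
    """Circular upshift by the magnitude in P0-P4; the sign bit P5 is ignored."""
    n = p & 0x1F
    x &= M32
    return ((x << n) | (x >> (32 - n))) & M32
```

A downshift by 32 with sign fill therefore yields a word made up entirely of the fill bit, which matches the range stated in note 2.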



TABLE 14-1. BIT-MANIPULATION INSTRUCTIONS 



Mnemonics 


Code 


Description 


Y Output 


Status 


Unsel 


Sel 


S 


M 


L 


z 


V 


N 


C 


SETBIT-A 


68 


Bit Set 


A 


Yi = Ai, Yp = 1 








* 




* 




SETBIT-B 


69 


B 


Yi = Bi, Yp = 1 








* 




* 




RSTBIT-A 


6A 


Bit Reset 


A 


Yi = Ai, Yp = 








* 




* 




RSTBIT-B 


6B 


B 


Y| = Bi, Yp = 








* 




* 




EXTBIT-A 


66 


Bit Extract 





if p > 0, Yo = Ap 2) 
if p < 0, Yo = Ap 






* 


* 








EXTBIT-B 


67 





if p > 0, Yo = Bp 2) 
if p < 0, Yo = Bp 






* 


* 








EXTBIT-STAT 


7E 





ifp>0, Yo = Sp 2) 
if p < 0, Yo = Sp 






* 











Notes: 1. These instructions use the field instruction format (FORMAT 2).

2. Y31 to Y1 are set to zero. "p" stands for the bit displacement from P0-P4 or from PR0-PR5. The sign of the position input is
ignored.

TABLE 14-2. BIT-MANIPULATION INSTRUCTIONS 



Mnemonics 


Code 


Description 


Status Register 


Y Output 


status 


S 


M 


L 


Z 


V 


N 


C 


SETBIT-STAT 


6C 


Status Bit Set 


Sp = 1 


S 


* 


* 


* 


* 


* 


* 


• 


RSTBIT-STAT 


6D 


Sp = 


S 


* 


* 


« 


* 


* 


* 


• 



Notes: 1. These instructions use the field instruction format (FORMAT 2).

2. "p" stands for the bit displacement from P0-P5 or from PR0-PR5.

Legend: Unsel = Unselected field
Sel = Selected field
A = A Input
B = B Input
Q = Q Register
* = Updated

Examples:

RSTBIT-B,,3      3rd bit is set to 0 in B
EXTBIT-STAT,,-   4th bit in status register is extracted and
                 inverted.
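
The single-bit operations reduce to simple masking. The sketch below is an assumption-laden illustration: it treats the position as a bit index taken from P0-P4, and it exposes the inverted form of the extract as an explicit `invert` argument, because the rule that selects between the true and inverted result is not fully legible in the table. The function names are hypothetical.

```python
M32 = 0xFFFFFFFF


def setbit(x: int, p: int) -> int:
    """Yi = Xi for all i except Yp = 1."""
    return (x | (1 << (p & 0x1F))) & M32


def rstbit(x: int, p: int) -> int:
    """Yi = Xi for all i except Yp = 0."""
    return x & ~(1 << (p & 0x1F)) & M32


def extbit(x: int, p: int, invert: bool = False) -> int:
    """Y0 = Xp (optionally inverted); Y31 to Y1 are zero."""
    bit = (x >> (p & 0x1F)) & 1
    return bit ^ 1 if invert else bit
```

Because the extract result lands in bit 0 with all other bits cleared, it can be used directly as a 0/1 value in later arithmetic.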

Aligned Fields
[Diagram: A and B fields of width W at position P; result A op B in the same position]

Non-Aligned Fields Case 1:
If position (P0-P5) >= 0, A is LSB aligned
Width (W0-W4) = 1 to 32

Non-Aligned Fields Case 2:
If position (P0-P5) < 0, B is LSB aligned
Width (W0-W4) = 1 to 32

Figure 6. Field Logical Operations

TABLE 15. FIELD LOGICAL INSTRUCTIONS



Mnemonics 


Code 


Description 


Y Output 


Status 


Unsel 


Sei 


S 


M 


L 


Z 


V 


N 


C 


PASSF-AL-A 


73 


Field Pass 3) 
3) 
4) 


B 


Yi = Ai 
















PASSF-AL-B 


6F 


B 


Yi = Bi 
















PASSF-A 


72 


B 


if p>0, Yi = Ai_p 
if p<0, Yi-p| = Ai 
















NOTF-AL-A 


71 


Field Complement 3) 
3) 
4) 


B 


Yi-Ai 
















NOTF-AL-B 


6E 


B 


Yi = Bi 
















NOTF-A 


70 


B 


if p>0, Yi = Ai-^ 
if p<0, Yi.p| = Ai 
















ORF-AL-A 


75 


Field OR 3) 
4) 


B 


Yi-Ai OR Bi 
















ORF-A 


74 


B 


if p>0, Yi = Ai.p OR Bi 
if p<0, Yi.|p| = AiOR Bi_p| 
















XORF-AL-A 


77 


Field XOR 3) 
4) 


B 


Yi = Ai XOR Bi 
















XORF-A 


76 


B 


if p>0, Yi = Ai-p XOR Bi 
if p<0, Yi.p| = AiXOR Bi-pi 
















ANDF-AL-A 


79 


Field AND 3) 
4) 


B 


Yi - Ai AND Bi 
















ANDF-A 


78 


B 


H p>0, Yi = Ai.p AND B| 
if p<0, Yi_p| = Ai AND Bi-pi 
















EXTF-A 


7A 


Field Extract 4) 5) 
4)5) 





if p>0, Yi = Ai.p 
if p<0, Yi-p| = Ai 
















EXTF-B 


7B 





if p>0, Y| = Bi_p 

if p<0, Yi-p| = Bi 
















EXTF-AB 


7C 





6) 
















EXTF-BA 


7D 





^' 


_ 















Notes: 1. These instructions use the field instruction format (FORMAT 2).

2. p <= i <= p + w - 1. "p" stands for position displacement from P0-P5 or from PR0-PR5 and "w" for the width of the bit field
from W0-W4 or WR0-WR4. Whenever p + w > 32, operation takes place only over the portion of the field up to the end of
the word. No wraparound occurs.

3. This instruction uses the aligned format (see Figure 6).

4. This instruction uses the unaligned field format (see Figure 6).
p >= 0: Case 1
p < 0: Case 2

5. If p is positive, the input is LSB aligned and Y output aligned at position.
If p is negative, the input is aligned at |p| and Y output at LSB.

6. Firstly, the concatenation of A (High Word) and B (Low Word) is rotated by the amount specified by the position (p). If p is
positive, left-rotate is performed. If p is negative, right-rotate is performed. Secondly, the least significant bits on the Y output
specified by the width (w) are extracted.

7. Same as 6) except that B input is taken as a high word and A input as a low word.



Legend: Unsel = Unselected Field
Sel = Selected Field
A = A Input
B = B Input
Q = Q Register
* = Updated

Examples:

For all examples, assume STATUS (7:0) is -7 and STATUS (12:8) is 3.

1. 0,PASSF-AL-B,11,20   Pass B to Y and test if B20 to B30
                        are all zero. Set Z status if so.



B: i |ooooooooooob ooooioioiiiooiioioo 

Z set to 1 in this case 



2. 3,XORF-A,,   Exclusive-OR bits A7-A9 with bits
                B0-B2 and output to Y0-Y2. Pass
                B3-B31 to Y3-Y31. Width and position
                values are obtained from STATUS(12:0).

A: 01101 t100010010000101l [i"ool l10101 1 

B: 00011 100001 01 00011 001 01 001 OOl [ooT| 

A9.7©B2_o-Y: 00011 1 00001 01 00011 001 01 001 OOl|Toi1 
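
The aligned and non-aligned cases can be sketched for the XOR variant as follows. The code follows the index rules in the table and notes (p <= i <= p + w - 1, truncation at the end of the word, unselected bits taken from B); the function names and the decision to clip the field for negative positions at 32 - |p| bits are assumptions made for this illustration.

```python
M32 = 0xFFFFFFFF


def field_mask(p: int, w: int) -> int:
    """Ones in bits p .. p+w-1, truncated at bit 31 (no wraparound)."""
    high = min(p + w, 32)
    return ((1 << high) - 1) & ~((1 << p) - 1) & M32


def xorf_aligned(a: int, b: int, p: int, w: int) -> int:
    """Aligned form: Yi = Ai XOR Bi inside the field, Yi = Bi elsewhere."""
    m = field_mask(p, w)
    return ((a ^ b) & m) | (b & ~m & M32)


def xorf_nonaligned(a: int, b: int, p: int, w: int) -> int:
    """Non-aligned form.

    p >= 0 (Case 1): A is LSB aligned, Yi = A(i-p) XOR Bi at position p.
    p <  0 (Case 2): B is LSB aligned, Y(i-|p|) = Ai XOR B(i-|p|).
    """
    if p >= 0:
        m = field_mask(p, w)
        return (((a << p) ^ b) & m) | (b & ~m & M32)
    n = -p
    m = field_mask(0, min(w, 32 - n))
    return (((a >> n) ^ b) & m) | (b & ~m & M32)
```

With position -7 and width 3, as in the second example, the call XORs A7-A9 with B0-B2, places the result in Y0-Y2, and passes B3-B31 through.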

TABLE 16. MASK INSTRUCTION














Mnemonics 


Code 


Description 


Y Output 


Status 


Unsei 


Sei 


S 


M 


L 


Z 


V 


N 


C 


PASS-MASK 


7F 


Generate K4ask 


P5 


Yi = P5 

















Notes: 1. This instruction uses the field instruction format (FORMAT 2).

2. p <= i <= p + w - 1. "p" stands for the position displacement and "w" for the width of bit field.

Legend: Unsel = Unselected Field
Sel = Selected Field
A = A Input
B = B Input
Q = Q Register
* = Updated

Example:

0, PASS-MASK, 8, 10   Generates an 8-bit field mask pattern starting from bit position 10.

[Bit diagram: the field occupies bits 17 to 10 of the 32-bit word; boundaries marked at 31, 18, 17, 10, 9, 0]

ABSOLUTE MAXIMUM RATINGS

Storaae TemDeratnre -65 to +15( 


( 

)°C Commercial (C) 

5°C Temperature 

Siinnlv Vnlta 


}PERATING RANGES 

Case Devices 






Case Tempera! 
SuddIv Voltaae 


ure Under Bias (Tc) ....-55 to -H2 


(Ta) to +70''C 


tn r^rniinH Pntf^ntial 


ie Vr-/- +47R V tn +R9R V 


Continuous .. -0.3 to +7.0 V . devices 

DC Voltage Applied to Outputs Temnarature (T.l - 


55 to -H25°C 


for HIGH Outmit State -0.3 V to Vr-r + O.; 


• Supply Volta 


36 O/nr) +4 






ge -0.3 to Vcc +0.C 




DC Input Volta 
DC Output Gun 
DC Input Curre 

Stresses above 
RATINGS may 
at or above ttie 
maximum ratini 
reliability. 

DC CHARA 

Subgroups 1, 








nt -10 mA to -^-lO mA 55*C 


those listed under ABSOLUTE MAXIMUM „ ..,,.., 
cause permanent device failure. Functionality Operating ranges define those limits between which the 
se limits is not implied. Exposure to absolute functionality of the device is guaranteed, 
js for extended periods may affect device 

CTERISTICS over operating range unless otherwise specified (for APt Products, Group A, 
2, 3 are tested unless otherwise noted) 


Parameter  Parameter                                  Test Conditions
Symbol     Description                                (Note 1)                         Min.         Max.   Unit

VOH        Output HIGH Voltage                        Vcc = Min., VIN = VIH or VIL,    2.4                 Volts
                                                      IOH = -0.4 mA
VOL        Output LOW Voltage                         Vcc = Min., VIN = VIH or VIL,                 0.5    Volts
                                                      IOL = (illegible) mA for Y-Bus
                                                      & 4 mA for All Other Pins
VIH        Guaranteed Input Logical HIGH Voltage                                       2.0                 Volts
           (Note 2)
VIL        Guaranteed Input Logical LOW Voltage                                                     0.8    Volts
           (Note 2)
IIL        Input LOW Current                          Vcc = Max., VIN = (illegible)                 -10    µA
IIH        Input HIGH Current                         Vcc = Max., VIN = (illegible)                 10     µA
IOZH       Off State (High Impedance) Output Current  (illegible)                                   10     µA
IOZL                                                  VO = 0.5 V                                    -10
ICC        Static Power Supply Current (Note 3)       Vcc = Max.,                      COM'L        70     mA
                                                      VIN = Vcc or GND, IO = 0 µA      MIL          70
CPD*       Power Dissipation Capacitance (Note 4)     Vcc = 5.0 V, TA = 25°C,          (illegible)         pF Typical
                                                      No Load


Notes: 1. Vcc conditions shown as Min. or Max. refer to the Commercial or Military Vcc limits.

2. These input levels provide zero-noise immunity and should only be statically tested in a noise-free environment (not functionally
tested).

3. Worst-case Icc is measured at the lowest temperature in the specified operating range.

4. CPD determines the no-load dynamic current consumption:

Icc (Total) = Icc (Static) + CPD × Vcc × f, where f is the switching frequency of the majority of the internal nodes, normally one-half
of the clock frequency.
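
As a rough illustration of Note 4, the sketch below evaluates the no-load supply current, assuming the formula reads Icc (Total) = Icc (Static) + CPD × Vcc × f with f taken as one-half of the clock frequency. The function name and the CPD value of 1000 pF are hypothetical placeholders; the typical CPD figure is not legible in the table above.

```python
def total_supply_current_ma(icc_static_ma, cpd_pf, vcc_v, clock_hz):
    """Estimate no-load Icc(Total) in mA from Note 4.

    Assumes Icc(Total) = Icc(Static) + CPD * Vcc * f, where f is the
    switching frequency of most internal nodes (half the clock frequency).
    """
    f_hz = clock_hz / 2.0                        # internal switching frequency
    dynamic_a = cpd_pf * 1e-12 * vcc_v * f_hz    # C * V * f gives amperes
    return icc_static_ma + dynamic_a * 1e3       # convert A to mA


# Hypothetical operating point: 70 mA static, CPD = 1000 pF (placeholder),
# Vcc = 5.0 V, 10 MHz clock. The dynamic term comes to roughly 25 mA.
print(round(total_supply_current_ma(70.0, 1000.0, 5.0, 10e6), 3))
```

With a real CPD value from the device characterization, the same call gives the expected no-load current at any clock rate; load current must be added separately.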

*This parameter is not tested.
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SWITCHING CHARACTERISTICS 
over COMMERCIAL operating range

A. COMBINATIONAL PROPAGATION DELAYS

                                            29C332       29C332-1     29C332-2
No.  From                  To               Max. Delay   Max. Delay   Max. Delay   Unit

1    PA0-PA3, PB0-PB3      PERR             25           20           18           ns
2    DA0-DA31, DB0-DB31    PERR             32           28           23           ns
3    DA0-DA31, DB0-DB31    PY0-PY3          59           42           34           ns
4    DA0-DA31, DB0-DB31    Y0-Y31           49           35           28           ns
5    DA0-DA31, DB0-DB31    C, Z, V, N, L    60           43           34           ns
6    DA0-DA31, DB0-DB31    MSERR            68           49           40           ns
7    I0-I8                 PY0-PY3          74           53           43           ns
8    I0-I8                 Y0-Y31           66           47           38           ns
9    I0-I8                 C, Z, V, N, L    67           (illegible)  39           ns
10   I0-I8                 MSERR            77           (illegible)  44           ns
11   W0-W4                 PY0-PY3          58           (illegible)  32           ns
12   W0-W4                 Y0-Y31           52           34           28           ns
13   W0-W4                 C, Z, V, N, L    57           (illegible)  28           ns
14   W0-W4                 MSERR            62           (illegible)  33           ns
15   P0-P5                 PY0-PY3          67           46           39           ns
16   P0-P5                 Y0-Y31           (illegible)  42           34           ns
17   P0-P5                 C, Z, V, N, L    (illegible)  43           36           ns
18   P0-P5                 MSERR            (illegible)  45           36           ns
19   CP                    PY0-PY3          (illegible)  55           44           ns
20   CP                    Y0-Y31           (illegible)  52           42           ns
21   CP                    C, Z, V, N, L    74           55           44           ns
22   CP                    STATUS REG.      28           25           20           ns
23   RS                    C, Z, V, N, L    23           21           17           ns
24   MCin                  Y0-Y31           43           31           25           ns
25   MCin                  C, Z, V, N, L    48           34           28           ns
26   MCin                  MSERR            52           37           30           ns
27   MLINK                 Y0-Y31           46           33           27           ns
28   MLINK                 C, Z, V, N, L    52           37           30           ns
29   MLINK                 MSERR            53           38           31           ns
30   M/m                   Y0-Y31           46           33           27           ns
31   M/m                   C, Z, V, N, L    52           37           30           ns
32   M/m                   MSERR            53           38           31           ns
33   BOROW                 Y0-Y31           46           33           27           ns
34   BOROW                 C, Z, V, N, L    52           37           30           ns
35   BOROW                 MSERR            53           38           31           ns
36   HOLD                  C, Z, V, N, L    31           22           18           ns
37   HOLD                  MSERR            35           29           24           ns
38   PY0-PY3               MSERR            24           22           18           ns
39   Y0-Y31                MSERR            24           22           18           ns
40   C, Z, V, N, L         MSERR            24           22           18           ns
41   PERR                  MSERR            24           22           18           ns
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No. 


Parameter (Note 1) 


For 


With Respect 
To 


29C332 


29C332-1 


29C332-2 


Unit 


Max. Value 


Max. Value 


Max. Value 


42 


Input Data Setup 


DA0-DA31, DB0-DB31 


cp T 


56 


31 


31 


ns 


43 


Input Data Hold 


DA0-DA31, DB0-DB31 


cpT 











ns 


44 


Byte Width Setup 


I7-I8 


cpT 


66 


30 


30 


ns 


45 


Byte Width Hold 


I7-I8 


cpT 











ns 


46 


Instruction Setup 


lo-le 


cpT 


71 


37 


37 


ns 


47 


Instruction Hold 


I0-I6 


cpT 





Oi 





ns 


48 


Width Setup 


W0-W4 


cpT 


64 


m: 


28 


ns 


49 


Width Hold 


W0-W4 


cpT 





■■^*^- ■ - 





ns 


50 


Position Setup 


P0-P5 


cpT 


66 


it n 


28 


• ns 


51 


Position Hold 


P0-P5 


cpT 





%,,..j 





ns 


52 


Borrow Setup 


BOROW 


cpT 


51 ,#'%,:-^i 


22 


ns 


53 


Borrow Hold 


BOROW 


OPT 


<#%, 


« 





ns 


54 


Macro Carry Setup 


MCin 


OP T 


M. 


V 21 


21 


ns 


55 


Macro Carry Hold 


MCin 


CPt 


W^P' 








ns 


56 


Macro Link Setup 


MLINK 


cpT 


^msi.'^. 


22 


22 


ns 


57 


Macro Unk Hold 


MLINK 


cpT J, 










ns 


58 


Macro/Micro Setup 


M/m 


■cpT ,^;' 


IB'-«o 


22 


22 


ns 


59 


Macro/Micro Hold 


M/m 


CP T .# 1 


fc:"' 








ns 


60 


Hold Mode Setup 


HOLD 


CP t:*%. 


2B 


11 


11 


ns 


61 


Hold Mode Hold 


HOLD 













ns 



SWITCHING CHARACTERISTICS over COMMERCIAL operating range (Cont'd.) 

B. SETUP AND HOLD TIMES 



C. MINIMUM CLOCK REQUIREMENTS

                                29C332       29C332-1     29C332-2
No.  Description                Max. Value   Max. Value   Max. Value   Unit

62   Minimum Clock LOW Time     20           20           20           ns
63   Minimum Clock HIGH Time    20           20           20           ns


D. ENABLE AND DISABLE TIMES

                                                               29C332       29C332-1     29C332-2
No.  From    To                      Description               Max. Value   Max. Value   Max. Value   Unit

64   OE-Y    Y0-Y31, PY0-PY3         Output Enable Time                                               ns
65   OE-Y    Y0-Y31, PY0-PY3         Output Disable Time                                              ns
66   SLAVE   C, Z, V, N, L, PERR     Slave Mode Enable Time                                           ns
67   SLAVE   Y0-Y31, PY0-PY3,        Slave Mode Disable Time                                          ns
             C, Z, V, N, L, PERR

Notes: 1. See timing diagram for desired mode of operation to determine clock edge to which these setup and
hold times apply.
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SWITCHING CHARACTERISTICS over MILITARY 
operating range

A. COMBINATIONAL PROPAGATION DELAYS

                                            29C332
No.  From                  To               Max. Delay   Unit

1    PA0-PA3, PB0-PB3      PERR             28           ns
2    DA0-DA31, DB0-DB31    PERR             35           ns
3    DA0-DA31, DB0-DB31    PY0-PY3          65           ns
4    DA0-DA31, DB0-DB31    Y0-Y31           54           ns
5    DA0-DA31, DB0-DB31    C, Z, V, N, L    66           ns
6    DA0-DA31, DB0-DB31    MSERR            75           ns
7    I0-I8                 PY0-PY3          82           ns
8    I0-I8                 Y0-Y31           73           ns
9    I0-I8                 C, Z, V, N, L    (illegible)  ns
10   I0-I8                 MSERR            (illegible)  ns
11   W0-W4                 PY0-PY3          (illegible)  ns
12   W0-W4                 Y0-Y31           (illegible)  ns
13   W0-W4                 C, Z, V, N, L    63           ns
14   W0-W4                 MSERR            68           ns
15   P0-P5                 PY0-PY3          74           ns
16   P0-P5                 Y0-Y31           65           ns
17   P0-P5                 C, Z, V, N, L    66           ns
18   P0-P5                 MSERR            69           ns
19   CP                    PY0-PY3          82           ns
20   CP                    Y0-Y31           75           ns
21   CP                    C, Z, V, N, L    82           ns
22   CP                    STATUS REG.      31           ns
23   RS                    C, Z, V, N, L    25           ns
24   MCin                  Y0-Y31           47           ns
25   MCin                  C, Z, V, N, L    53           ns
26   MCin                  MSERR            57           ns
27   MLINK                 Y0-Y31           51           ns
28   MLINK                 C, Z, V, N, L    57           ns
29   MLINK                 MSERR            58           ns
30   M/m                   Y0-Y31           51           ns
31   M/m                   C, Z, V, N, L    57           ns
32   M/m                   MSERR            58           ns
33   BOROW                 Y0-Y31           51           ns
34   BOROW                 C, Z, V, N, L    57           ns
35   BOROW                 MSERR            58           ns
36   HOLD                  C, Z, V, N, L    34           ns
37   HOLD                  MSERR            39           ns
38   PY0-PY3               MSERR            26           ns
39   Y0-Y31                MSERR            26           ns
40   C, Z, V, N, L         MSERR            26           ns
41   PERR                  MSERR            26           ns
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SWITCHING CHARACTERISTICS over MILITARY operating range (Cont'd.) 



B. SETUP AND HOLD TIMES

                                                    With          29C332
No.  Parameter (Note 1)   For                   Respect To    Max. Value   Unit

42   Input Data Setup     DA0-DA31, DB0-DB31    CP ↑          62           ns
43   Input Data Hold      DA0-DA31, DB0-DB31    CP ↑          (illegible)  ns
44   Byte Width Setup     I7-I8                 CP ↑          73           ns
45   Byte Width Hold      I7-I8                 CP ↑          (illegible)  ns
46   Instruction Setup    I0-I6                 CP ↑          (illegible)  ns
47   Instruction Hold     I0-I6                 CP ↑          (illegible)  ns
48   Width Setup          W0-W4                 CP ↑          70           ns
49   Width Hold           W0-W4                 CP ↑          0            ns
50   Position Setup       P0-P5                 CP ↑          73           ns
51   Position Hold        P0-P5                 CP ↑          (illegible)  ns
52   Borrow Setup         BOROW                 CP ↑          56           ns
53   Borrow Hold          BOROW                 CP ↑          (illegible)  ns
54   Macro Carry Setup    MCin                  CP ↑          55           ns
55   Macro Carry Hold     MCin                  CP ↑          (illegible)  ns
56   Macro Link Setup     MLINK                 CP ↑          47           ns
57   Macro Link Hold      MLINK                 CP ↑          (illegible)  ns
58   Macro/Micro Setup    M/m                   CP ↑          55           ns
59   Macro/Micro Hold     M/m                   CP ↑          (illegible)  ns
60   Hold Mode Setup      HOLD                  CP ↑          31           ns
61   Hold Mode Hold       HOLD                  CP ↑          (illegible)  ns



C. MINIMUM CLOCK REQUIREMENTS

                                29C332
No.  Description                Max. Value   Unit

62   Minimum Clock LOW Time     22           ns
63   Minimum Clock HIGH Time    22           ns


D. ENABLE AND DISABLE TIMES

                                                               29C332
No.  From    To                      Description               Max. Value   Unit

64   OE-Y    Y0-Y31, PY0-PY3         Output Enable Time                     ns
65   OE-Y    Y0-Y31, PY0-PY3         Output Disable Time                    ns
66   SLAVE   C, Z, V, N, L, PERR     Slave Mode Enable Time                 ns
67   SLAVE   Y0-Y31, PY0-PY3,        Slave Mode Disable Time                ns
             C, Z, V, N, L, PERR



Notes: 1. See timing diagram for desired mode of operation to determine clock edge to which these setup and 
hold times apply. 
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SWITCHING TEST CIRCUIT 



<=i.=F 



M- 



A. Three-State Outputs 

Notes: 1. CL = 50 pF includes scope probe, wiring and stray capacitances without device in test fixture.

2. S1, S2, S3 are closed during function tests and all AC tests except output enable tests.

3. S1 and S3 are closed while S2 is open for tPZH test.
S1 and S2 are closed while S3 is open for tPZL test.

4. CL = TBD for output disable tests.



SWITCHING TEST WAVEFORMS 



Notes: 1. Diagram shown for HIGH data only. Output transition 
may be opposite sense. 
2. Cross hatched area is don't care condition. 



[Figures: Setup, Hold, and Release Times; Pulse Width (drawings WFR02970, WFR02790)]
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SWITCHING TEST WAVEFORMS (Cont'd.) 

[Figures: Propagation Delay; Enable and Disable Times (drawing WFR02660)]

Notes: 1. Diagram shown for Input Control Enable-LOW and Input Control
Disable-HIGH.
2. S1, S2 and S3 of Load Circuit are closed except where shown.



Test Philosophy and Methods 

The following points give the general philosophy that we apply
to tests that must be properly engineered if they are to be
implemented in an automatic environment. The specifics of
which philosophy applies to which test are shown.

1. Ensure the part is adequately decoupled at the test head.
Large changes in supply current when the device switches 
may cause function failures due to Vcc changes. 

2. Do not leave inputs floating during any tests, as they may 
oscillate at high frequency. 

3. Do not attempt to perform threshold tests at high speed. 
Following an input transition, ground current may change by 
as much as 400 mA in 5 - 8 ns. Inductance in the ground 
cable may allow the ground pin at the device to rise by 
hundreds of millivolts momentarily. 

4. Use extreme care in defining input levels for AC tests. Many
inputs may be changed at once, so there will be significant
noise at the device pins, and they may not actually reach VIL
or VIH until the noise has settled. AMD recommends using
VIL ≤ 0 V and VIH ≥ 3 V for AC tests.

5. To simplify failure analysis, programs should be designed to 
perform DC, Function, and AC tests as three distinct groups 
of tests. 

6. Capacitive Loading for AC Testing 

Automatic testers and their associated hardware have stray 
capacitance that varies from one type of tester to another, 
but is generally around 50 pF. This, of course, makes it 
impossible to make direct measurements of parameters 
that call for a smaller capacitive load than the associated 
stray capacitance. Typical examples of this are the so-called
"float delays" which measure the propagation
delays into and out of the high impedance state and are
usually specified at a load capacitance of 5.0 pF. In these
cases, the test is performed at the higher load capacitance
(typically 50 pF) and engineering correlations based on
data taken with a bench setup are used to predict the
result at the lower capacitance.



Similarly, a product may be specified at more than one
capacitive load. Since the typical automatic tester is not
capable of switching loads in mid-test, it is impossible to
make measurements at both capacitances even though
they may both be greater than the stray capacitance. In
these cases, a measurement is made at one of the two
capacitances. The result at the other capacitance is
predicted from engineering correlations based on data
taken with a bench setup and the knowledge that certain
DC measurements (IOH, IOL, for example) have already
been taken and are within specification. In some cases,
special DC tests are performed in order to facilitate this
correlation.

7. Threshold Testing 

The noise associated with automatic testing, the long,
inductive cables, and the high gain of bipolar devices when
in the vicinity of the actual device threshold, frequently give
rise to oscillations when testing high-speed circuits.
These oscillations are not indicative of a reject device, but
instead, of an overtaxed test system. To minimize this
problem, thresholds are tested at least once for each input
pin. Thereafter, "hard" HIGH and LOW levels are used for
other tests. Generally this means that function and AC
testing are performed at "hard" input levels rather than at
VIL Max. and VIH Min.

8. AC Testing 

Occasionally, parameters are specified that cannot be
measured directly on automatic testers because of tester
limitations. Data input hold times often fall into this category.
In these cases, the parameter in question is guaranteed
by correlating these tests with other AC tests that have
been performed. These correlations are arrived at by the
cognizant engineer by using data from precise bench
measurements in conjunction with the knowledge that
certain DC parameters have already been measured and
are within specification.

In some cases, certain AC tests are redundant since they 
can be shown to be predicted by other tests that have 
already been performed. In these cases, the redundant 
tests are not performed. 
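
The ground-bounce warning in point 3 can be checked with a back-of-the-envelope calculation. The sketch below applies V = L × ΔI/Δt to the 400 mA step quoted there; the 10 nH ground-path inductance is purely an assumed illustrative figure, not a value from this data book, and the function name is hypothetical.

```python
def ground_bounce_v(inductance_h, delta_i_a, delta_t_s):
    """Voltage developed across a ground-path inductance: V = L * dI/dt."""
    return inductance_h * delta_i_a / delta_t_s


ASSUMED_GROUND_INDUCTANCE_H = 10e-9   # assumption: 10 nH of ground cable/lead
CURRENT_STEP_A = 0.400                # 400 mA change, from point 3

# Evaluate both ends of the 5 - 8 ns transition window given in point 3.
for transition_ns in (5, 8):
    bounce = ground_bounce_v(ASSUMED_GROUND_INDUCTANCE_H,
                             CURRENT_STEP_A,
                             transition_ns * 1e-9)
    print(f"{transition_ns} ns edge: about {bounce * 1000:.0f} mV of ground rise")
```

Under this assumed inductance the rise lands between roughly 500 and 800 mV, consistent with the "hundreds of millivolts" stated above, and comparable to the VIL Max. threshold being tested.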
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SWITCHING WAVEFORK 
KEY TO SWITCHING WAVEFORMS

[Figure: key to switching waveform symbols for inputs and outputs (drawing KS000010)]

[Figure: Setup and Hold Timing]
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SWiTCHiNG WAVEFORMS (Cont d.) 



[Figure: propagation delay waveforms for PERR, PY0-PY3, Y0-Y31, C, Z, N, V, L, MSERR, and the Status Register]

Propagation Delays (SLAVE = LOW)

Inputs: PA0-PA3, PB0-PB3, DA0-DA31, DB0-DB31, I0-I8, W0-W4, P0-P5, CP, RS,
MCin, MLINK, M/m, BOROW, HOLD
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PYo-f^ 



SWITCHING WAVEFORMS (Cont'd.) 

[Figure: Propagation Delay (SLAVE = HIGH); signals PY0-PY3, Y0-Y31, C, Z, N, V, L, PERR, MSERR]

[Figure: Enable/Disable I (SLAVE = HIGH); OE-Y to Y0-Y31, PY0-PY3 (drawing WF023710)]

[Figure: Enable/Disable II (OE-Y = LOW); SLAVE to Y0-Y31, PY0-PY3, C, Z, V, N, L, PERR (drawing WF023720)]
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INPUT/OUTPUT CIRCUIT DIAGRAMS 



[Figure: driven input and output equivalent circuits]

CI ≈ 5.0 pF, all inputs

CO ≈ 5.0 pF, all outputs
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Am29C334 

CMOS Four-Port Dual-Access Register File 



PRELIMINARY 



DISTINCTIVE CHARACTERISTICS 



64 x 18-Bit Wide Register File
The Am29C334 is a 64 x 18-bit, dual-access RAM with
two read ports and two write ports.

Pipelined Data Path
The Am29C334 can be configured to support either a
non-pipelined data path (similar to the Am29334) or a
pipelined data path.

Cascadable
The Am29C334 is cascadable to support either wider
word widths, deeper register files, or both.

Built-In Forwarding Logic
The Am29C334 provides simultaneous read/write access
to the same address for double pipelined systems.

Byte Parity Storage
Width of 18 bits facilitates byte parity storage for each
port and provides consistency with the Am29C332
32-bit ALU.

Byte Write Capability
Individual byte-write enables allow byte or full word
write.


BLOCK DIAGRAMS 




Non-Pipelined Mode 




Pipelined Mode 
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Publication # Rev. Amenijment 

08786 B /O 

Issue Date: December 1987 



GENERAL DESCRIPTION 



The Am29C334 is a 64-word by 18-bit dual-access RAM with
two read ports and two write ports. Two independent, simultaneous
accesses are possible and each access can be either a
read or a write. It is designed to be used in a system that
requires as many as two reads and two writes in a single cycle.
The device can be configured to support either a non-pipelined
data path or a pipelined data path.

The Am29C334 is also fully compatible with the bipolar
Am29334. When the device is connected to the pinout
specified for the Am29334, it will appear as a 64-word by 18-bit
array without support for pipelined operation. The pipelined
operation of the Am29C334 is made possible because of the
availability of unused power pins not required by the CMOS
part. The pipelined operation is disabled by attaching the PIPE
pin to Vcc.



RELATED AMD PRODUCTS 


Part No.    Description

Am29C323    CMOS 32-Bit Parallel Multiplier
Am29325     32-Bit Floating Point Processor
Am29C325    CMOS 32-Bit Floating Point Processor
Am29331     16-Bit Microprogram Sequencer
Am29C331    CMOS 16-Bit Microprogram Sequencer
Am29332     32-Bit Extended Function ALU
Am29C332    CMOS 32-Bit Extended Function ALU
Am29334     64 x 18 Four-Port Dual-Access Register File
Am29337     16-Bit Bounds Checker
Am29338     128 x 9 Byte Queue
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CONNECTION DIAGRAM 
120 Lead PGA* 



[Figure: 120-lead PGA pin grid (drawing CD010320). Pin assignments are listed in the Table of Interconnections.]



*Pins facing up. 



TABLE OF INTERCONNECTIONS 

(Sorted by Pin Name) 



PIN NAME 



Arao 
Arai 

ArA2 
AraS 
AhA4 
ArA5 

Arbo 
Arbi 
Arbj 

Arb3 
Arb4 

Arbs 

AWAO 
AWAI 
AWA2 
AwA3 
AWA4 
AWAS 
AwBO 

Awbi 

AWB2 
AWB3 
AWB4 
AwB5 

Daoo 
Daoi 

Da02 



PIN 
NO. 


PAD 
NO. 


PIN 






Da03 






Da04 






Daos 






Da06 


D02 


63 


Da07 


C02 


62 


Da08 


B01 


61 


DaD9 


A02 


120 


Daio 


B03 


119 


Daii 


L02 


74 


Da12 


K11 


92 


Da13 


Ml 3 


91 


Da14 


N12 


90 


Da15 


M11 


89 


Da16 


M03 


77 


Da17 


B13 


105 


Dboo 


D03 


3 


Dboi 


C01 


2 


Db02 


A01 


1 


Dbos 


B02 


60 


Db04 


A03 


59 


Dbos 


M02 


15 


DsOfi 


L12 


32 


Db07 


N13 


31 


Dbos 


M12 


30 


Db09 


N11 


29 


Dbio 


N03 


17 


Db11 


A13 


46 


Dbi2 


D01 


4 


Dbi3 


EOS 


64 


Db14 


E01 


5 


Db15 



PIN 
NO. 



E02 

F01 

F02 

G03 

G02 

G01 

H01 

H02 

JOS 

J01 

J02 

K03 

K02 

K01 

L03 

C12 

C11 

D13 

D12 

D11 

F12 

F13 

CI 3 

F11 

011 

G13 

G12 

H12 

LI 3 

H13 

H11 



PAD 
NO. 



65 
6 
66 
7 
67 
9 
69 
10 
70 
11 
71 
12 
72 
13 
73 

104 
44 

103 
43 

102 
42 

101 
41 

100 
40 
96 
36 
95 
35 
94 
34 



PIN NAIME 



Db16 

Db17 

GND 

GND 

GND 

GND 

GNDA 

GNDA 

GNDA 

GNDA 

LEa 

LEp 

S^* 

5Eb 

PIPE 

Vcc 
Vcc 
Vcc 

VCCA 
VCCA 
WEac/CLKa 

weah 
weal 

WEbc/CLKb 

WEbH 

WEbl 

Yaoo 
Yaoi 

Ya02 
YA03 
Ya04 



PIN 
NO. 



K13 
K12 
F03 
J11 
J12 
J13 
N05 
N09 
A09 
A05 
L01 
812 
L06 
COS 
H03 
Ell 
E12 
E13 
LOS 
C06 
M01 
N02 
N01 
A12 
B11 
All 
L04 
M04 
N04 
LOS 
M05 



PAD 
NO. 



93 
33 
8 
37 
38 
39 
20 
26 
50 
56 
14 
45 
23 
53 
68 
97 
98 
99 
83 
113 
75 
76 
16 
106 
107 
47 
18 
78 
19 
79 
80 



PIN NAME 



PIN 
NO. 



YA05 
Ya06 
YA07 
Va08 
Ya09 

Yaio 
Yaii 

Ya12 
Ya13 
Ya14 

Yais 

Ya16 
Ya17 
YboO 

Yboi 

Yb02 
Yb03 
Yb04 

Ybos 

Yb06 
Yb07 

Ybos 
Ybo9 
Ybio 
Ybii 
Ybi2 

Yb13 
Yb14 
Yb15 
Yb16 
Yb17 



N06 

M06 

L07 

M07 

N07 

N08 

l\108 

L09 

M09 

L10 

l\^10 

N10 

L11 

C03 

A04 

B04 

C04 

BOS 

005 

B06 

A06 

A07 

B07 

C07 

BOS 

AOS 

B09 

C09 

A10 

BIO 

CIO 



PAD 
NO. 



21 
81 
22 
82 
24 
84 
25 
85 
86 
27 
87 
28 
88 

118 
58 

117 
57 

116 

115 
55 

114 
54 

112 
52 

111 
51 

110 

109 
49 

108 
48 
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TABLE OF INTERCONNECTIONS (Cont'd.) 








(Sorted by 


Pin No.) 




PIN 
NO. 


PIN NAME 


PAD 
NO. 


PIN 
NO. 


PIN NAME 


PAD 
NO. 


PIN 
NO. 


PIN NAME 


PAD 
NO. 


PIN 
NO. 


PIN NAME 1'^ 








COS 


Yb05 


115 


H02 


Daio 


10 


MOS 


Ya04 


80 








COS 


VCCA 


113 


H03 


PIPE 


68 


M06 


Ya06 


81 








CO? 


Ybio 


52 


H11 


Db15 


34 


M07 


Ya08 


82 








COS 


OEb 


53 


H12 


Dbi2 


95 


MOS 


Yah 


25 


A01 


AwA2 


1 


C09 


Yb14 


109 


HIS 


Db14 


94 


MOS 


Ya13 


86 


A02 


ArA3 


120 


CIO 


Ybi7 


48 


J01 


Da12 


11 


M10 


Ya15 


87 


A03 


AwA4 


59 


C11 


Dboi 


44 


J02 


Dai 3 


71 


Mil 


ArB3 


89 


A04 


Yboi 


58 


C12 


Dboo 


104 


JOS 


Daii 


70 


M12 


AwB2 


30 


A05 


GNDA 


56 


C13 


Db07 


41 


J11 


GND 


37 


M1S 


Arbi 


91 


A06 


Yb07 


114 


t)01 


Daoo 


4 


J12 


GND 


38 


N01 


WEal 


16 


A07 


Yb08 


54 


D02 


AraO 


63 


J13 


GND 


39 


N02 


WEah 


76 


A08 


Yb12 


51 


D03 


AwAO 


3 


K01 


Dai 6 


13 


N03 


AwB4 


17 


Aog 


GNDA 


50 


D11 


Db04 


102 


K02 


Da15 


72 


N04 


Ya02 


19 


A10 


Yb15 


49 


D12 


Dbo3 


43 


K03 


Da14 


12 


N05 


GNDA 


20 


A11 


WEbl 


47 


D13 


Db02 


103 


K11 


Arbo 


92 


N06 


Ya05 


21 


A12 


WEbc/CLKb 


106 


E01 


Da02 


5 


K12 


Db17 


33 


N07 


Ya09 


24 


A13 


AwB5 


46 


E02 


Dags 


65 


K13 


Db16 


93 


N08 


Yaio 


84 


B01 


Ara2 


61 


EOS 


Daoi 


64 


L01 


LEa 


14 


N09 


GNDA 


26 


B02 


AwA3 


60 


E11 


Vcc 


97 


L02 


Aras 


74 


N10 


Ya16 


28 


BOS 


Ara4 


119 


E12 


Vcc 


98 


LOS 


Dai 7 


73 


N11 


AWB3 


29 


B04 


Yb02 


117 


E13 


Vcc 


99 


L04 


Yaoo 


18 


N12 


ArB2 


90 


805 

B06 


Vb04 
Yb06 


116 

55 


F01 
F02 


Da04 
Da05 


6 
66 


LOS 
L06 


Ya03 
OEa 


79 
23 


N13 


AwBI 


31 






B07 


Yb09 


112 


F03 


GND 


8 


L07 


Ya07 


22 






608 


Ybii 


111 


F11 


Dbob 


100 


LOS 


VcCA 


83 






809 


Yb13 


110 


F12 


Db05 


42 


LOS 


Yai2 


85 






BIO 


Ybi6 


108 


F13 


Db06 


101 


L10 


Ya14 


27 






B11 


WEbh 


107 


G01 


Da08 


9 


L11 


Ya17 


88 
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UEb 


45 


G02 


Da07 


67 


L12 


Aw BO 


32 






B13 


Arb5 


105 


GOS 


Da06 


7 


L13 


D B13 


35 






C01 


AWA1 


2 


Gil 


Db09 


40 


M01 


WEac/CLKa 


75 






C02 


Arai 


62 


G12 


Dbii 


36 


M02 


Aw AS 


15 






COS 


Yboo 


118 


G13 


Dbio 96 


M03 


ArB4 


77 






C04 


Yb03 


57 


HOI 


Dao9 69 


M04 


Yaoi 


78 
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ORDERING INFORMATION 
Standard Products 



AMD standard products are available in several packages and operating ranges. The order number (Valid
Combination) is formed by a combination of:

a. Device Number
b. Speed Option (if applicable)
c. Package Type
d. Temperature Range
e. Optional Processing

a. DEVICE NUMBER/DESCRIPTION
   Am29C334
   CMOS Four-Port Dual-Access Register File

b. SPEED OPTION
   -1 = Speed Select

c. PACKAGE TYPE
   G = 120-Lead Pin Grid Array without Heatsink
   (CGX120)

d. TEMPERATURE RANGE
   C = Commercial (0 to +70°C)

e. OPTIONAL PROCESSING
   Blank = Standard processing

Valid Combinations

AM29C334      GC, GCB
AM29C334-1    GC, GCB


Valid Combinations list configurations planned to be 
supported in volume for this device. Consult the local AMD 
sales office to confirm availability of specific valid 
combinations, to check on newly released valid combinations, 
and to obtain additional data on AMD's standard military 
grade products. 
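
The way an order number is assembled from fields a through e can be sketched as follows. This is a minimal illustration only: the function and table names are hypothetical, the package/temperature entry printed as "GO" is assumed to read "GC", the fields are assumed to be concatenated without separators, and the meaning of the trailing "B" processing code is not spelled out in this section.

```python
# Valid combinations as read from the Valid Combinations table
# (assumption: the entry printed as "GO" is "GC").
VALID_COMBINATIONS = {
    "AM29C334": ("GC", "GCB"),
    "AM29C334-1": ("GC", "GCB"),
}


def order_number(speed_option="", package="G", temperature="C", processing=""):
    """Join device number, speed option, package, temperature, processing."""
    device = "AM29C334" + speed_option
    suffix = package + temperature + processing
    if suffix not in VALID_COMBINATIONS.get(device, ()):
        raise ValueError(f"{device}{suffix} is not a listed valid combination")
    return device + suffix


print(order_number())                        # standard speed, standard processing
print(order_number("-1", processing="B"))    # speed select with "B" processing
```

Availability of any particular combination still has to be confirmed with the sales office, as stated above; the lookup only mirrors the printed list.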
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ORDERING INFORMATION (Cont'd.) 
APL Products 



AMD products for Aerospace and Defense applications are available in several packages and operating ranges. APL (Approved 
Products List) products are fully compliant with MIL-STD-883C requirements. The order number (Valid Combination) for APL 
products is formed by a combination of:

a. Device Number
b. Speed Option (if applicable)
c. Device Class
d. Package Type
e. Lead Finish

a. DEVICE NUMBER/DESCRIPTION
   Am29C334
   CMOS Four-Port Dual-Access Register File

b. SPEED OPTION
   Not Applicable

c. DEVICE CLASS
   /B = Class B

d. PACKAGE TYPE
   Z = 120-Lead Pin Grid Array without Heatsink
   (CGX120)

e. LEAD FINISH
   C = Gold

Valid Combinations

AM29C334      /BZC


Valid Combinations list configurations planned to be 
supported in volume for this device. Consult the local AMD 
sales office to confirm availability of specific valid 
combinations or to check for newly released valid 
combinations. 

Group A Tests 

Group A tests consist of Subgroups 
1, 2, 3, 7, 8, 9, 10, 11. 
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PIN DESCRIPTrON 



Ara0-Ara5 Read Address A-Side (Input)

The 6-bit read address input selects one of the 64 memory
locations for output to the Ya Data Latch.

Arb0-Arb5 Read Address B-Side (Input)

The 6-bit read address input selects one of the 64 memory
locations for output to the Yb Data Latch.

Awa0-Awa5 Write Address A-Side (Input)

The 6-bit write address input selects one of the 64 memory
locations for writing new data from the Da input.

Awb0-Awb5 Write Address B-Side (Input)

The 6-bit write address input selects one of the 64 memory
locations for writing new data from the Db input.

Da0-Da17 Data A-Side (Input)

New data is written into memory from this input, as selected
by the Awa address input.

Db0-Db17 Data B-Side (Input)

New data is written into memory from this input, as selected
by the Awb address input.

GND, Vcc Power

Power supply for the internal logic (0, 5 V).

GNDA, VccA Power

Power supply for the output drivers (0, 5 V).

LEa Ya Data Latch Enable (Input, Active HIGH) 

The LEa input controls the latch for the Ya output port. 
When LEa is HIGH, the latch is open (transparent) and data 
from the RAM, as selected by the Ara address inputs, is 
passed to the Ya output. When LEa is LOW, the latch is 
closed and it retains the last data read from the RAM. LEa is 
disabled in the pipelined mode. 

LEb Yb Data Latch Enable (Input, Active HIGH)

The LEb input controls the latch for the Yb output port.
When LEb is HIGH, the latch is open (transparent), and data
from the RAM, as selected by the Arb address inputs, is
passed to the Yb output. When LEb is LOW, the latch is
closed and it retains the last data read from the RAM. LEb is
disabled in the pipelined mode.

OEa Ya Output Enable (Input, Active LOW)

When OEa is LOW, data in the Ya Data Latch is driven on
the Ya output. When OEa is HIGH, Ya output is in the high-
impedance (off) state.

OEb Yb Output Enable (Input, Active LOW)

When OEb is LOW, data in the Yb Data Latch is driven on
the Yb outputs. When OEb is HIGH, Yb output is in the high-
impedance (off) state.

PIPE Pipeline Enable (Input, Active LOW)

When PIPE is LOW, the input and output registers are
enabled, allowing for pipelined operation. When HIGH,
these registers are made transparent.

WEac/CLKa Write Enable A-Side Common (Input,
Active LOW)

When WEac is LOW together with WEah or WEal, new
data is written into the location selected by the Awa
address. When WEac is HIGH, no data is written into the
RAM through the A port. WEac acts as a clock input in the
pipeline mode for the A side.

WEbc/CLKb Write Enable B-Side Common (Input,
Active LOW)

When WEbc is LOW together with WEbh or WEbl, new
data is written into the location selected by the Awb
address. When WEbc is HIGH, no data is written into the
RAM through the B port. WEbc acts as a clock input in the
pipeline mode for the B side.

WEah High-Byte Write Enable A-Side (Input, Active
LOW)

When WEah is LOW together with WEac, new data is
written into the high byte of the location selected by the
Awa address input. When WEah is HIGH, no data is written
into the high byte.

WEbh High-Byte Write Enable B-Side (Input, Active
LOW)

When WEbh is LOW together with WEbc, new data is
written into the high byte of the location selected by the
Awb address input. When WEbh is HIGH, no data is written
into the high byte.

WEal Low-Byte Write Enable A-Side (Input, Active
LOW)

When WEal is LOW together with WEac, new data is
written into the low byte of the location selected by the Awa
address input. When WEal is HIGH, no data is written into
the low byte.

WEbl Low-Byte Write Enable B-Side (Input, Active
LOW)

When WEbl is LOW together with WEbc, new data is
written into the low byte of the location selected by the Awb
address input. When WEbl is HIGH, no data is written into
the low byte.

Ya0-Ya17 Data Latch (Outputs, Three-State)

The 18-bit Ya Data Latch outputs. 

Yb0-Yb17 Data Latch (Outputs, Three-State)

The 18-bit Yb Data Latch outputs. 

FUNCTIONAL DESCRIPTION

The heart of the Am29C334 is a high-speed 64-word by 18-bit 
dual RAM cell array. Six write enables permit the RAM word to 
be written in one or both of its 9-bit bytes. Data to be written is 
presented to each side of the RAM array through the two data 
ports (Da and Db). 
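
The byte-write behavior described above can be sketched as a small software model. This is only an illustration, not a description of the device internals: it assumes the high byte occupies bits 17-9 and the low byte bits 8-0, and the function name `write_word` is hypothetical. The enables are modeled as active LOW (0 = asserted), matching the pin descriptions.

```python
WORDS = 64                           # 64-word array, per the text
BYTE_BITS = 9                        # each 18-bit word holds two 9-bit bytes
LOW_MASK = (1 << BYTE_BITS) - 1      # bits 8-0 (assumed low-byte position)
HIGH_MASK = LOW_MASK << BYTE_BITS    # bits 17-9 (assumed high-byte position)

def write_word(ram, addr, data, we_c, we_h, we_l):
    """Model one write port. All enables are active LOW (0 = asserted)."""
    if we_c != 0:
        return  # common enable HIGH: nothing is written through this port
    word = ram[addr]
    if we_l == 0:
        word = (word & ~LOW_MASK) | (data & LOW_MASK)
    if we_h == 0:
        word = (word & ~HIGH_MASK) | (data & HIGH_MASK)
    ram[addr] = word

ram = [0] * WORDS
# Write only the low byte of location 5.
write_word(ram, 5, 0x3FFFF, we_c=0, we_h=1, we_l=0)
print(hex(ram[5]))  # 0x1ff
```

With the common enable HIGH, the call leaves the array untouched regardless of the byte enables, which mirrors the WEac/WEbc pin description.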

The remainder of the logic surrounding the RAM array
supports pipelining the RAM access and providing a forwarding
path for data around the RAM. This forwarding path is
needed to eliminate the latency cycle associated with consecutive
write/read accesses to the same memory location in a
pipelined system.

Pipelining of the RAM is controlled by the PIPE pin. When not
asserted (i.e., in non-pipelined mode) the registers on the
inputs (write ports Da/b, write addresses Awa/b, and write
enables WEac/bc) are made fully transparent, while the
registers at the outputs (the read ports Ya/b) are turned into
latches, controlled by the latch enables LEa/b.

In either mode of operation, each side of the RAM is controlled 
by its individual control signals. This means that the two sides 
of the RAM can operate at different clock rates to one 



another. In the pipelined mode, these clock rates must have a
known relationship between each other.

In the non-pipelined mode, there is no need for a relationship
between the clock rates. Two special cases of operation arise
because of this. The first is where the location written to by
one side is being read from the other side. In this case, known
as A-to-B transparency, the value read is the value being
written. The second occurs when two writes to the same
location occur at the same time. In this case the value written
cannot be defined, but the operation is not harmful to the
device.

The transparency mode (A-A or B-B) during a write
(WEa = LOW) allows the data in (Da) to not only be written
into memory, but also to appear at the output (Ya) when the
output latch (LEa) is HIGH and the output enable control
(OEa) is LOW.

Extensions to Four Read Ports and Two Write Ports

A RAM with four read ports and two write ports can be made 
by using two dual-access RAMs and connecting each of the 
write ports, write addresses, and write enables in parallel for 
the two devices. Figure 2 details this in a non-pipelined mode. 
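
A behavioral sketch of this arrangement follows. It is an illustration only: the class names `DualAccessRam` and `FourReadTwoWrite` are hypothetical, timing is ignored, and each device is modeled as a plain list. The point it shows is that every write goes to both devices, so each device always holds the same contents and can serve two independent reads.

```python
WORD_MASK = 0x3FFFF  # 18-bit words

class DualAccessRam:
    """Simplified stand-in for one 64 x 18 dual-access device."""
    def __init__(self, words=64):
        self.cells = [0] * words

    def write(self, addr, data):
        self.cells[addr] = data & WORD_MASK

    def read(self, addr):
        return self.cells[addr]

class FourReadTwoWrite:
    """Two devices with write data, address, and enables wired in parallel."""
    def __init__(self):
        self.dev0 = DualAccessRam()
        self.dev1 = DualAccessRam()

    def write(self, addr, data):
        # A write on either write port reaches both devices.
        self.dev0.write(addr, data)
        self.dev1.write(addr, data)

    def read4(self, a0, a1, a2, a3):
        # Two reads are served by each device.
        return (self.dev0.read(a0), self.dev0.read(a1),
                self.dev1.read(a2), self.dev1.read(a3))

rf = FourReadTwoWrite()
rf.write(3, 0x12345)
rf.write(7, 0x00ABC)
print([hex(v) for v in rf.read4(3, 7, 7, 3)])
```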



[Block diagram: Am29C331 16-Bit Sequencer; Microprogram Memory;
Pipeline Register; Am29C334 Register File (64 x 18); Am29C332
32-Bit ALU; Am29C325 32-Bit Floating-Point Processor; Am29C323
32 x 32 Parallel Multiplier; Control Signals]

Figure 1. Am29C300 CMOS Family High-Performance System Block Diagram



32 Word x 36 Bit Single-Access RAM

It is possible to convert the 64 word x 18 bit dual-access RAM
into a 32 word x 36 bit single-access RAM. This is performed
by storing the upper half of the 36 bits in the upper half of the
64 words and addressing these from the A side, and storing
the lower half of the 36 bits in the lower half of the 64 words
and addressing these from the B side. This arrangement does
not change the capacity of the RAM, but the dual access is
lost (see Figure 4).
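
The address mapping can be sketched as below. This is a minimal model under stated assumptions: "upper half of the 64 words" is taken to mean locations 32-63 and "lower half" locations 0-31, and the helper names `write36` and `read36` are hypothetical.

```python
HALF = 32             # assumed split: words 0-31 lower half, 32-63 upper half
MASK18 = 0x3FFFF

def write36(ram, addr, value):
    """Store a 36-bit value at logical address 0..31."""
    ram[HALF + addr] = (value >> 18) & MASK18  # upper 18 bits, A side
    ram[addr] = value & MASK18                 # lower 18 bits, B side

def read36(ram, addr):
    """Reassemble the 36-bit value from both halves."""
    return (ram[HALF + addr] << 18) | ram[addr]

ram = [0] * 64
write36(ram, 10, 0xABCDE1234)
print(hex(read36(ram, 10)))  # 0xabcde1234
```

Both ports are used for every access, which is why the dual access is lost while the total capacity (64 x 18 = 32 x 36 bits) is unchanged.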

Operational Modes 

The Am29C334 may be configured in a non-pipelined mode or
in a pipelined mode by controlling the PIPE pin. This mode is 
selected via hardwiring the pin to either LOW or HIGH. This 
option should not be changed during operation. 



Non-Pipelined Data Path

In non-pipelined mode (PIPE = 1), the Am29C334 is a flow- 
through device; data is read out, used, and written back all in 
the same cycle. In this mode all the registers are made 
transparent except the registers at the two read ports that are 
configured as latches. The read port latches are controlled 
individually by the LEa and LEb, so that they are transparent 
when the latch enables are HIGH and retain the data when the 
latch enables are LOW. The "forwarding logic" incorporated
to support the pipelined mode of operation is also disabled in 
this mode of operation (specifically, the address comparators 
are disabled). 

In the non-pipelined mode of operation it is possible to
simultaneously read two ports, read one port and write to the
other, or write to two ports. The read and write
addresses are internally multiplexed on each side. The selection
of the read and write addresses is controlled by the
exclusive-OR of the PIPE pin and WEac/bc. Normally,
WEac/bc are connected to the system clock. With PIPE
de-asserted, the read address will be selected in the high part of
the clock cycle (WEac/bc = 1) and the write address selected
only in the low part. Byte selection for writing on either port is
controlled by the WEh/l pins.
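
The address selection rule can be written as a one-line function. This is a sketch only; the polarity (exclusive-OR result of 1 selects the write address) is inferred from the stated non-pipelined case, where PIPE = 1 and WEac/bc = 1 select the read address. The function name is hypothetical.

```python
def selected_address(pipe, we_c, read_addr, write_addr):
    """Pick the RAM address from the exclusive-OR of PIPE and WEac/bc.

    pipe and we_c are pin levels (0 or 1). An XOR result of 0 selects the
    read address; 1 selects the write address (inferred polarity).
    """
    return write_addr if (pipe ^ we_c) else read_addr

# Non-pipelined mode (PIPE = 1): read in the high part of the clock,
# write in the low part.
print(selected_address(1, 1, "read", "write"))  # read
print(selected_address(1, 0, "read", "write"))  # write
```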

Two interesting cases arise as a result of the dual access 
capability. The first occurs if a location is written into by one 
side while it is being read out by the other side. In this case,
known as A-to-B transparency, the data being written will 
appear on the read port after the TransparencyAB time (if 
other read access time parameters are met). The second case 
of interest occurs if both sides write to the same location at the 
same time. The value written as a result of this operation 
cannot be defined. 

Pipelined Data Path 

The Am29C334 can be configured in a pipelined system by
asserting the PIPE signal (PIPE = 0) and adding an additional
external register in the write address and the write control path
on both A and B ports as shown in Figure 3. The registers on
each side are controlled by separate clocks that are supplied
over the WEac and WEbc pins.

Typically, in a pipelined system a read-modify-write would
span three cycles. In the second half of the first cycle, a read
of the operand(s) is performed and the data is clocked into the
output registers at the end of the cycle. In the second cycle,
the operation is performed on the operands and the result is
clocked into the data register on the write port at the end of
the second cycle. In the first half of the third cycle, the data is
written to the register file. Therefore, in any cycle, a pipelined
system is writing the result of instruction n (in the first half),
executing instruction n + 1, and reading the operands needed
in instruction n + 2. In any case, a write operation followed by
a read operation is performed in the RAM in a cycle.
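
The overlap of the three stages can be tabulated with a short script. This is an illustrative sketch with a hypothetical function name; it assumes instruction i reads its operands in cycle i, executes in cycle i + 1, and writes back in cycle i + 2, as in the description above.

```python
def pipeline_schedule(num_instructions):
    """Return one row per cycle: (cycle, writing, executing, reading).

    Each entry is an instruction index, or None when that stage is idle.
    """
    def valid(i):
        return i if 0 <= i < num_instructions else None

    rows = []
    for cycle in range(num_instructions + 2):
        # Write-back lags the read by two cycles, execution by one.
        rows.append((cycle, valid(cycle - 2), valid(cycle - 1), valid(cycle)))
    return rows

for row in pipeline_schedule(4):
    print(row)
```

In the steady state (for example cycle 2 of the run above) the row shows instruction 0 being written, instruction 1 executing, and instruction 2 being read.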

A special case arises if the data to be written by the previous
instruction is needed in the next instruction as an operand.
Due to the pipeline register being at its write port, the location
is not written into until the next cycle, and hence only the
previous value is available in the current cycle. To overcome
this problem, "forwarding logic" is included as shown in the
block diagram. This logic consists of three elements: an
address comparator, an AND gate, and a three-to-one
multiplexer, as shown. If the read address of the current instruction
is the same as the write address of the previous instruction,
and if the result is to be written, then the data to be written is
forwarded by the forwarding multiplexer to the output
registers. Since there are two write ports, forwarding paths on both
ports are provided. As each write port has byte write capability,
the forwarding is further broken into the upper and lower
bytes.
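
The forwarding decision can be modeled roughly as follows. This is a simplified sketch, not the device logic: it handles a single pending write rather than both write ports, the bit positions of the two bytes (17-9 and 8-0) are assumed, and the function name and tuple layout are hypothetical.

```python
LOW_MASK = 0x1FF       # assumed low byte, bits 8-0
HIGH_MASK = 0x3FE00    # assumed high byte, bits 17-9

def read_with_forwarding(ram, read_addr, pending):
    """Return the value seen at a read port.

    pending is None or a tuple (write_addr, data, write_high, write_low)
    describing the write still held in the write-port pipeline register.
    """
    word = ram[read_addr]
    if pending is None:
        return word
    write_addr, data, write_high, write_low = pending
    if write_addr != read_addr:        # address comparator
        return word
    # Forward only the bytes that will actually be written (AND gate),
    # byte by byte (forwarding multiplexer).
    if write_low:
        word = (word & ~LOW_MASK) | (data & LOW_MASK)
    if write_high:
        word = (word & ~HIGH_MASK) | (data & HIGH_MASK)
    return word

ram = [0] * 64
ram[3] = 0x15555
print(hex(read_with_forwarding(ram, 3, (3, 0x3FFFF, False, True))))  # 0x155ff
```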

Since each side has its own WEc/CLK control, it is possible to
clock each side of the chip differently. However, if the part is
used at different frequencies, the forwarding cannot be
guaranteed unless the addresses compared are held valid
long enough to allow for a comparison to be made and the
results of the forwarding set up on the output register.

As mentioned earlier, it is necessary to use an external write 
address and write control registers in a pipelined system. 
These registers have not been included for two reasons. First, 
it is possible for the user to abort the writing before it fills the 
internal pipe. This situation may arise in cases such as in 
"traps." Second, by providing an external write address 
register it provides the flexibility of obtaining the write address 
from several sources by using an external multiplexer.

Figure 2. RAM with Four Read Ports and Two Write Ports for Non-pipelined Mode

Figure 3. System Diagram With the Am29C334 in a Double Pipelined Data Path

ABSOLUTE MAXIMUM RATINGS

Storage Temperature -65 to +150°C 

Temperature Under Bias - Tc -55 to +125°C 

Supply Voltage to Ground Potential 

Continuous -0.3 to +7.0 V 

DC Voltage Applied to Outputs 

for HIGH Output State -0.3 V to +Vcc + 0.3 V 

DC Input Voltage -0.3 V to +Vcc + 0.3 V 

DC Output Current, Into LOW Outputs 30 mA

DC Input Current -10 mA to +10 mA 

Stresses above those listed under ABSOLUTE MAXIMUM
RATINGS may cause permanent device failure. Functionality
at or above these limits is not implied. Exposure to absolute
maximum ratings for extended periods may affect device
reliability.

DC CHARACTERISTICS over operating range unless otherwise specified (for APL Products, Group A,
Subgroups 1, 2, 3 are tested unless otherwise noted)



OPERATING RANGES 

Commercial (C) Devices 

Temperature (Ta) 0 to +70°C

Supply Voltage +4.75 to +5.25 V 

Military* (M) Devices 

Temperature (Ta) -55 to +125°C

Supply Voltage (Vcc) +4.5 to +5.5 V 

Operating ranges define those limits between which the
functionality of the device is guaranteed.

* Military product 100% tested at Ta = +25°C, +125°C, and
-55°C.



Parameter  Parameter                      Test Conditions
Symbol     Description                    (Note 1)                             Min.  Max.  Unit

VOH        Output HIGH Voltage            VCC = Min., VIN = VIL or VIH,                    Volts
                                          IOH = -4 mA
VOL        Output LOW Voltage             VCC = Min., VIN = VIL or VIH,                    Volts
                                          IOL = 8 mA
VIH        Input HIGH Level               Guaranteed Input Logical                         Volts
                                          HIGH Voltage (Note 2)
VIL        Input LOW Level                Guaranteed Input Logical                         Volts
                                          LOW Voltage (Note 2)
IIL        Input LOW Current              VCC = Max., VIN = 0.6 V
IIH        Input HIGH Current             VCC = Max., VIN = VCC - 0.5 V
IOZH       Off State (High-Impedance)     VCC = Max., VO = 2.4 V
IOZL       Output Current                 VCC = Max., VO = 0.5 V
ICC        Static Power Supply Current    VIN = VCC or GND, VCC = Max.,
                                          IO = 0 µA
                                            TA = -55 to +125°C
                                            TA = 0 to +70°C
CPD        Power Dissipation Capacitance  VCC = 5.0 V, TA = 25°C, No Load      900 pF Typical
           (Note 3)

Notes: 1. VCC conditions shown as Min. or Max. refer to the commercial (±5%) VCC limits.

2. These input levels provide zero-noise immunity and should only be statically tested in a noise-free environment (not functionally
tested).

3. CPD determines the no-load dynamic current consumption:

ICC (Total) = ICC (Static) + CPD VCC f, where f is the switching frequency of the majority of the internal nodes, normally one-half
of the clock frequency. This specification is not tested.
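
As a worked example of the Note 3 formula, the sketch below evaluates the dynamic term for the typical CPD of 900 pF. The 20 MHz clock and the zero static current are illustrative assumptions, not datasheet limits, and the function name is hypothetical.

```python
def total_icc_ma(icc_static_ma, cpd_pf, vcc_v, clock_hz):
    """ICC(Total) = ICC(Static) + CPD * VCC * f, with f = clock / 2."""
    f = clock_hz / 2.0                       # switching frequency of internal nodes
    dynamic_a = cpd_pf * 1e-12 * vcc_v * f   # amperes
    return icc_static_ma + dynamic_a * 1e3   # milliamperes

# Assumed operating point: CPD = 900 pF (typical), VCC = 5.0 V,
# 20 MHz clock, static current taken as 0 mA for illustration.
dynamic_only = total_icc_ma(0.0, 900.0, 5.0, 20e6)
print(round(dynamic_only, 3))  # about 45 mA of no-load dynamic current
```

Because the dynamic term scales linearly with both VCC and clock frequency, halving the clock in this example would halve the dynamic contribution.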

SWITCHING CHARACTERISTICS over COMMERCIAL operating range unless otherwise specified
NON-PIPELINED MODE (Note 1)

SWITCHING CHARACTERISTICS over MILITARY operating range unless otherwise specified (for APL
Products, Group A, Subgroups 9, 10, 11 are tested unless otherwise noted)

NON-PIPELINED MODE (Note 1) 


                                                                                          29C334
No.  Parameter                  Description                      Test Conditions        Min.  Max.     Unit

 1   Access Time                Ara or Arb to Ya or Yb           LEa or LEb = H               40       ns
 2   Access Time                WEac or WEbc to Ya or Yb         LEa or LEb = H               37       ns
 3   Turn-On Time               OEa or OEb ↓ to Ya or Yb                                      16       ns
                                Active
 4   Turn-Off Time (Note 2)     OEa or OEb ↑ to Ya or Yb =                                    25       ns
                                High Impedance
 5   Enable Time                LEa or LEb ↑ to Ya or Yb                                      21       ns
 6   Transparency               WEa or WEb ↓ to Ya or Yb         LEa or LEb = H               47       ns
 7   Transparency               Da or Db to Ya or Yb             LEa or LEb = H,              47       ns
                                                                 WEa or WEb = L
 8   Write Recovery Time        Ara or Arb to WEac or WEbc                                    (2)-(1)  ns
 9   Data Setup Time            Da or Db to WEa or WEb ↑                                19             ns
10   Data Hold Time             Da or Db to WEa or WEb ↑                                2              ns
11   Address Setup Time         Awa or Awb to WEa or WEb ↓                              4              ns
12   Address Hold Time          Awa or Awb to WEa or WEb ↑                              2              ns
13   Address Setup Time         Ara or Arb to LEa or LEb ↓                              23             ns
14   Address Hold Time          Ara or Arb to LEa or LEb ↓                              1              ns
15   Latch Close Before Write   LEa or LEb to WEa or WEb ↓                                             ns
16   Read Before Latch Close    WEac or WEbc to LEa or LEb ↓                            24             ns
17   Write Pulse Width          WEa or WEb (LOW)                                        23             ns
18   Latch Data Capture         LEa or LEb (HIGH)                                       17             ns
     Pulse Width

Notes: 1. WEa = WEac + WEal/h
          WEb = WEbc + WEbl/h

2. Ya and Yb are tested independently.

3. Minimum delays are not tested.

SWITCHING WAVEFORMS
NON-PIPELINED MODE

[Waveform WF023340: Read Function (* means A or B)]

[Waveform: Write Function (* means A or B)]

[Waveform: Transparency. NOTE: LE* = HIGH, OE* = LOW; * means A or B]

SWITCHING CHARACTERISTICS over COMMERCIAL operating range (Cont'd.)
PIPELINED MODE

                                                                 29C334      29C334-1    29C334-2
No.  Parameter                  Description                      Min.  Max.  Min.  Max.  Min.  Max.  Unit

19   Write Data Setup Time      Da or Db to CLKa or CLKb ↑       15          13          13          ns
20   Write Data Hold Time       Da or Db to CLKa or CLKb ↑       1           1           1           ns
21   Write Address Setup Time   Awa or Awb to CLKa or CLKb ↑     23                      20          ns
22   Write Address Hold Time    Awa or Awb to CLKa or CLKb ↑                                         ns
23   Write Enable Setup Time    WEh or WEl to CLKa or CLKb ↑     20                      16          ns
24   Write Enable Hold Time     WEh or WEl to CLKa or CLKb ↑                                         ns
25   Read Address Setup Time    Ara or Arb to CLKa or CLKb ↑     24                      20          ns
26   Read Address Hold Time     Ara or Arb to CLKa or CLKb ↑                                         ns
27   Minimum Clock Cycle        CLKa or CLKb (LOW)               50                      40          ns
28   Minimum Clock Pulse        CLKa or CLKb (HIGH)                                                  ns
29   Minimum Clock Pulse        CLKa or CLKb (LOW)               17                      14          ns
30   Clock to Y                 Ya or Yb to CLKa or CLKb               14          12          10    ns

SWITCHING CHARACTERISTICS over MILITARY operating range (Cont'd.)
PIPELINED MODE

                                                                 29C334
No.  Parameter                  Description                      Min.  Max.  Unit

19   Write Data Setup Time      Da or Db to CLKa or CLKb ↑       19
20   Write Data Hold Time       Da or Db to CLKa or CLKb ↑
21   Write Address Setup Time   Awa or Awb to CLKa or CLKb ↑     27
22   Write Address Hold Time    Awa or Awb to CLKa or CLKb ↑
23   Write Enable Setup Time    WEh or WEl to CLKa or CLKb ↑     23
24   Write Enable Hold Time     WEh or WEl to CLKa or CLKb
25   Read Address Setup Time    Ara or Arb to CLKa or CLKb ↑     28
26   Read Address Hold Time     Ara or Arb to CLKa or CLKb ↑
27   Minimum Clock Cycle        CLKa or CLKb (LOW)               55
28   Minimum Clock Pulse        CLKa or CLKb (HIGH)              20
29   Minimum Clock Pulse        CLKa or CLKb (LOW)               20
30   Clock to Y                 Ya or Yb to CLKa or CLKb

SWITCHING WAVEFORMS (Cont'd.)
PIPELINED MODE

[Waveform: CLK*, Ar*, WE*, WE*h,l, and Y* (PREVIOUS DATA, NEW DATA); * means A or B]

[Waveform WFR02990: CLOCK, INPUT TO OUTPUT DELAY, CLOCK TO OUTPUT DELAY]

SWITCHING TEST CIRCUIT

Notes: 1. CL = 50 pF includes scope probe, wiring and
stray capacitances without device in test fixture.

2. S1, S2, S3 are closed during function tests
and all AC tests except output enable tests.

3. S1 and S3 are closed while S2 is open for
tPZH test. S1 and S2 are closed while S3 is
open for tPZL test.

4. CL = TBD for output disable tests.



KEY TO SWITCHING WAVEFORMS

[Waveform key KS000010. Legible entries:
DON'T CARE: ANY CHANGE PERMITTED
WILL BE CHANGING FROM H TO L
WILL BE CHANGING FROM L TO H
CHANGING; STATE UNKNOWN
CENTER LINE IS HIGH IMPEDANCE "OFF" STATE]



INPUT/OUTPUT CIRCUIT DIAGRAMS

[Circuit diagram IC000861: DRIVEN INPUT and OUTPUT stages]

CI ≈ 5.0 pF, all inputs

CO ≈ 5.0 pF, all outputs

Am29C325

CMOS 32-Bit Floating-Point Processor 
ADVANCE INFORMATION 



DISTINCTIVE CHARACTERISTICS 



Single VLSI device performs high-speed floating-point 
arithmetic 

- Floating-point addition, subtraction, and multiplication
in a single clock cycle

- Internal architecture supports sum-of-products, 
Newton-Raphson division 

32-bit, three-bus flow-through architecture 

- Programmable I/O allows interface to 32- and 16-bit 
systems 



IEEE and DEC formats 

- Performs conversions between formats 

- Performs integer ↔ floating-point conversions

Input and output registers can be made transparent
independently

Pin and functionally compatible with the Bipolar Am29325

The Am29C325 uses less than one-quarter the power of
the Am29325

145 PGA requires no heatsink

GENERAL DESCRIPTION



The Am29C325 is a high-speed floating-point processor 
unit. It performs 32-bit single-precision floating-point addi- 
tion, subtraction, and multiplication operations in a single 
VLSI circuit, using the format specified by the proposed 
IEEE floating-point standard, 754. The DEC single-preci- 
sion floating-point format is also supported. Operations for 
conversion between 32-bit integer format and floating-point 
format are available, as are operations for converting 
between the IEEE and DEC floating-point formats. Any
operation can be performed in a single clock cycle. Six 
flags — invalid operation, inexact result, zero, not-a-num- 
ber, overflow, and underflow — monitor the status of opera- 
tions. 

The Am29C325 has a three-bus, 32-bit architecture, with 
two input buses and one output bus. This configuration 



provides high I/O bandwidth, allows access to all buses, 
and affords a high degree of flexibility when connecting this 
device in a system. All buses are registered, with each 
register having a clock enable. Input and output registers 
may be made transparent independently. Two other I/O 
configurations, a 32-bit, two-bus architecture and a 16-bit,
three-bus architecture, are user-selectable, easing interface
with a wide variety of systems. Thirty-two-bit internal
feedforward datapaths support accumulation operations,
including sum-of-products and Newton-Raphson division.
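
Newton-Raphson division needs only the multiply and subtract operations the device provides, which is why it is listed alongside sum-of-products. The sketch below shows the standard reciprocal iteration x(k+1) = x(k) * (2 - b * x(k)) in ordinary Python floats. It is a numerical illustration only: the function names, the iteration count, and the way the initial estimate `x0` is supplied are assumptions, and the code does not model the Am29C325 instruction set or its 32-bit rounding.

```python
def reciprocal(b, x0, iterations=4):
    """Approximate 1/b by Newton-Raphson, using only multiply and subtract.

    x0 is an initial estimate of 1/b; the iteration converges when
    0 < b * x0 < 2, roughly doubling the number of correct bits per step.
    """
    x = x0
    for _ in range(iterations):
        x = x * (2.0 - b * x)
    return x

def divide(a, b, x0):
    """Compute a / b as a * (1/b)."""
    return a * reciprocal(b, x0)

print(divide(6.0, 3.0, 0.3))  # close to 2.0
```

In a hardware implementation the initial estimate would typically come from a small lookup table, and each iteration maps onto two multiplies and one subtract.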

Fabricated using Advanced Micro Devices' 1.2 micron 
CMOS process, the Am29C325 is powered by a single 5- 
volt supply. The device is housed in a 145-lead pin-grid- 
array package. 



Am29C300 FAMILY HIGH-PERFORMANCE SYSTEM BLOCK DIAGRAM 

Am29C327

CMOS Double-Precision Floating-Point Processor 






ADVANCE INFORMATION 



DISTINCTIVE CHARACTERISTICS 



High-performance double-precision floating-point processor

Comprehensive floating-point and integer instruction
sets

Single VLSI device performs single-, double-, and
mixed-precision operations

Performs conversions between precisions and between
data formats

Compatible with industry-standard floating-point formats

- IEEE 754 format

- DEC F, DEC D, and DEC G formats

- IBM System/370 format

Exact IEEE compliance for denormalized numbers with
no speed penalty

Eight-deep register file for intermediate results and on-chip
64-bit data path facilitates compound operations;
e.g., Newton-Raphson division, sum-of-products, and
transcendentals

Supports pipelined or flow-through operation

Fabricated with Advanced Micro Devices' 1.2 micron
CMOS process

SIMPLIFIED SYSTEM DIAGRAM

[Block diagram: R-Port (32) and S-Port; Operand Router; Constants;
R-Register; S-Register; Register File; ALU Input Multiplexer;
Floating-Point & Integer ALU; F-Register (64); Output Multiplexer;
F-Port (32)]

DEC F, DEC D, DEC G, and VAX are trademarks of the Digital Equipment Corporation.
IBM System/370 is a trademark of International Business Machines, Inc.

Publication # 09418  Rev. B  Amendment /0

CHAPTER 3



Bipolar Family 

Am29331 16-Bit Microprogram Sequencer 3-1

Am29332 32-Bit Arithmetic Logic Unit 3-36

Am29334 Four-Port Dual-Access Register File 3-74

Am29434 ECL Four-Port, Dual-Access Register File 3-89

Am29325 32-Bit Floating-Point Processor* 3-103

Am29337 16-Bit Bounds Checker 3-104

Am29338 32-Bit Byte Queue 3-115



* Front page only of data sheet. See Chapter 4 for complete data sheet.



16-Bit Microprogram Sequencer 



DISTINCTIVE CHARACTERISTICS 



16-Bits Address up to 64K Words
   Supports 80-90 ns microcycle time for a 32-bit
   high-performance system when used with the other
   members of the Am29300 Family.

Real-Time Interrupt Support
   Micro-trap and interrupts are handled transparently
   at any microinstruction boundary.

Built-in Conditional Test Logic
   Has twelve external test inputs, four of which are
   used to internally generate four additional test
   conditions.

Break-Point Logic
   Built-in address comparator allows break-points in
   the microcode for debugging and statistics collection.

Master/Slave Error Checking
   Two sequencers can operate in parallel as a master
   and a slave. The slave generates a fault flag for
   unequal results.

33-Level Stack
   Provides support for interrupts, loops, and subroutine
   nesting. It can be accessed through the D-bus
   to support diagnostics.

Speed improvement with Am29331A (15% faster
than Am29331)

GENERAL DESCRIPTION



The Am29331 is a 16-bit wide, high-speed single-chip 
sequencer designed to control the execution sequence of 
microinstructions stored in the microprogram memory. The 
instruction set is designed to resemble high-level language 
constructs, thereby bringing high-level language program- 
ming to the micro level. 

The Am29331 is interruptible at any microinstruction 
boundary to support real-time interrupts. Interrupts are 
handled transparently to the microprogrammer as an unex- 
pected procedure call. Traps are also handled transparent- 
ly at any microinstruction boundary. This feature allows re- 
execution of the prior microinstruction. Two separate buses 
are provided to bring a branch address directly into the chip 
from two sources to avoid slow turn-on and turn-off times 



for different sources connected to the data-input bus. Four 
sets of multiway inputs are also provided to avoid slow turn- 
on and turn-off times for different branch-address sources. 
This feature allows implementation of table look-up or use 
of external conditions as part of a branch address. The 33- 
deep stack provides the ability to support interrupts, loops, 
and subroutine nesting. The stack can be read through the 
D-bus to support diagnostics or to implement multitasking 
at the micro-architecture level. The master/slave mode 
provides a complete function check capability for the 
device. 

The Am29331 is designed with the IMOX process which
allows internal ECL circuits with TTL-compatible I/O. It is 
housed in a 120-lead pin-grid-array package. 



SIMPLIFIED BLOCK DIAGRAM 



[Block diagram: D-BUS, A-BUS (16 bits each), CARRY-IN]

IMOX is a trademark of Advanced Micro Devices, Inc.

RELATED AMD PRODUCTS

Part No.     Description

Am29114      Vectored Priority Interrupt Controller
Am29116      High-Performance Bipolar 16-Bit Microprocessor
Am29C116     High-Performance CMOS 16-Bit Microprocessor
Am29PL141    Field-Programmable Controller
Am29C323     CMOS 32-Bit Parallel Multiplier
Am29325      32-Bit Floating-Point Processor
Am29C325     CMOS 32-Bit Floating-Point Processor
Am29332      32-Bit Extended Function ALU
Am29C332     CMOS 32-Bit Extended Function ALU
Am29334      64x18 Four-Port, Dual-Access Register File
Am29C334     CMOS 64x18 Four-Port, Dual-Access Register File
Am29337      16-Bit Bounds Checker
Am29338      Byte Queue




[Block diagram: test inputs (T0-T7); multiway inputs (M0-M3);
Multi-Way Mux; Test Logic; Test Mux; Instr; Counter Mux; Counter;
33 x 16 Stack; D-Bus Mux; Stack Mux; SP; Address Mux; Interrupt Mux;
Address Register; Comp Reg]

Figure 1. Am29331 Detailed Block Diagram

CONNECTION DIAGRAM

(Bottom View)

PGA*

[Pin-grid diagram CD010382: pin positions A-1 through N-13; the
individual pin assignments are listed under PIN DESIGNATIONS.]

*Pinout observed from pin side of package.
Key: VCCE = VCC, ECL
     VCCT = VCC, TTL
     GNDE = GND, ECL
     GNDT = GND, TTL

PIN DESIGNATIONS
(Sorted by Pin No.)

PIN    PIN      PAD     PIN    PIN      PAD     PIN    PIN      PAD     PIN    PIN      PAD
NO.    NAME     NO.     NO.    NAME     NO.     NO.    NAME     NO.     NO.    NAME     NO.

-      -        99      C-5    Y2       115     H-2    M3,3     10      M-5    A13      80
-      -        97      C-6    GNDE     113     H-3    VCCE     68      M-6    D12      81
-      -        39      C-7    A4       52      H-11   I0       34      M-7    Y12      82
-      -        37      C-8    VCCE     53      H-12   S1       95      M-8    Y11      25
A-1    M0,0     1       C-9    Y5       109     H-13   S3       94      M-9    A10      86
A-2    D0       120     C-10   Y6       48      J-1    GNDT     11      M-10   D9       87
A-3    VCCT     59      C-11   T3       44      J-2    EQUAL    71      M-11   D8       89
A-4    A1       58      C-12   T2       104     J-3    A-FULL   70      M-12   A8       30
A-5    GNDT     56      C-13   T9       41      J-11   VCCE     38      M-13   I5       91
A-6    A3       114     D-1    M2,1     4       J-12   VCCE     38      N-1    D15      16
A-7    Y3       54      D-2    M1,1     63      J-13   VCCE     38      N-2    A15      76
A-8    D5       51      D-3    M0,1     3       K-1    RST      13      N-3    VCCT     17
A-9    GNDT     50      D-11   T6       102     K-2    OED      72      N-4    Y14      19
A-10   D6       49      D-12   T5       43      K-3    ERROR    12      N-5    GNDT     20
A-11   VCCT     47      D-13   T4       103     K-11   I3       92      N-6    Y13      21
A-12   A7       106     E-1    CIN      5       K-12   I2       33      N-7    D11      24
A-13   Y7       46      E-2    M0,2     65      K-13   I1       93      N-8    A11      84
B-1    M1,0     61      E-3    M3,1     64      L-1    INTR     14      N-9    GNDT     26
B-2    A0       60      E-11   GNDE     98      L-2    INTEN    74      N-10   A9       28
B-3    Y0       119     E-12   GNDE     98      L-3    INTA     73      N-11   VCCT     29
B-4    Y1       117     E-13   GNDE     98      L-4    D14      18      N-12   Y8       90
B-5    A2       116     F-1    M1,2     6       L-5    D13      79      N-13   FC       31
B-6    D3       55      F-2    M2,2     66      L-6    GNDE     23
B-7    D4       112     F-3    GNDE     8       L-7    A12      22
B-8    Y4       111     F-11   T10      100     L-8    VCCE     83
B-9    A5       110     F-12   T7       42      L-9    D10      85
B-10   A6       108     F-13   T8       101     L-10   Y10      27
B-11   D7       107     G-1    M1,3     9       L-11   Y9       88
B-12   T1       45      G-2    M0,3     67      L-12   I4       32
B-13   T0       105     G-3    M3,2     7       L-13   S2       35
C-1    M2,0     2       G-11   T11      40      M-1    SLAVE    75
C-2    M3,0     62      G-12   S0       36      M-2    HOLD     15
C-3    D1       118     G-13   CP       96      M-3    Y15      77
C-4    D2       57      H-1    M2,3     69      M-4    A14      78

PIN DESIGNATIONS
(Sorted by Pin Name)

PIN      PIN    PAD     PIN      PIN    PAD     PIN      PIN    PAD     PIN      PIN    PAD
NAME     NO.    NO.     NAME     NO.    NO.     NAME     NO.    NO.     NAME     NO.    NO.

-        -      37      D8       M-11   89      INTR     L-1    14      T7       F-12   42
-        -      39      D9       M-10   87      M0,0     A-1    1       T8       F-13   101
-        -      97      D10      L-9    85      M0,1     D-3    3       T9       C-13   41
-        -      99      D11      N-7    24      M0,2     E-2    65      T10      F-11   100
A-FULL   J-3    70      D12      M-6    81      M0,3     G-2    67      T11      G-11   40
A0       B-2    60      D13      L-5    79      M1,0     B-1    61      VCCE*    C-8    53
A1       A-4    58      D14      L-4    18      M1,1     D-2    63      VCCE*    H-3    68
A2       B-5    116     D15      N-1    16      M1,2     F-1    6       VCCE*    J-11   38
A3       A-6    114     EQUAL    J-2    71      M1,3     G-1    9       VCCE*    J-12   38
A4       C-7    52      ERROR    K-3    12      M2,0     C-1    2       VCCE*    J-13   38
A5       B-9    110     FC       N-13   31      M2,1     D-1    4       VCCE*    L-8    83
A6       B-10   108     GNDE     C-6    113     M2,2     F-2    66      VCCT     A-3    59
A7       A-12   106     GNDE     E-11   98      M2,3     H-1    69      VCCT     A-11   47
A8       M-12   30      GNDE     E-12   98      M3,0     C-2    62      VCCT     N-3    17
A9       N-10   28      GNDE     E-13   98      M3,1     E-3    64      VCCT     N-11   29
A10      M-9    86      GNDE     F-3    8       M3,2     G-3    7       Y0       B-3    119
A11      N-8    84      GNDE     L-6    23      M3,3     H-2    10      Y1       B-4    117
A12      L-7    22      GNDT     A-5    56      OED      K-2    72      Y2       C-5    115
A13      M-5    80      GNDT     A-9    50      RST      K-1    13      Y3       A-7    54
A14      M-4    78      GNDT     J-1    11      S0       G-12   36      Y4       B-8    111
A15      N-2    76      GNDT     N-5    20      S1       H-12   95      Y5       C-9    109
CIN      E-1    5       GNDT     N-9    26      S2       L-13   35      Y6       C-10   48
CP       G-13   96      HOLD     M-2    15      S3       H-13   94      Y7       A-13   46
D0       A-2    120     I0       H-11   34      SLAVE    M-1    75      Y8       N-12   90
D1       C-3    118     I1       K-13   93      T0       B-13   105     Y9       L-11   88
D2       C-4    57      I2       K-12   33      T1       B-12   45      Y10      L-10   27
D3       B-6    55      I3       K-11   92      T2       C-12   104     Y11      M-8    25
D4       B-7    112     I4       L-12   32      T3       C-11   44      Y12      M-7    82
D5       A-8    51      I5       M-13   91      T4       D-13   103     Y13      N-6    21
D6       A-10   49      INTA     L-3    73      T5       D-12   43      Y14      N-4    19
D7       B-11   107     INTEN    L-2    74      T6       D-11   102     Y15      M-3    77

*Single +5-Volt supply.
LOGIC SYMBOL



METALLIZATION AND PAD LAYOUT 







[Figure: logic symbol; legible signal labels include CP, SLAVE, ERROR, and EQUAL]




Die Size: 260x245 mil 
Equivalent Gate Count: 2500 



ORDERING INFORMATION 
Standard Products 

AMD standard products are available in several packages and operating ranges. The order number (Valid Combination) is
formed by a combination of:

a. Device Number
b. Speed Option (if applicable)
c. Package Type
d. Temperature Range
e. Optional Processing

a. DEVICE NUMBER/DESCRIPTION
   Am29331/Am29331A
   16-Bit Microprogram Sequencer

b. SPEED OPTION
   Not Applicable

c. PACKAGE TYPE
   G = 120-Lead Pin Grid Array with Heatsink (CG 120)

d. TEMPERATURE RANGE
   C = Commercial (0 to +85°C)

e. OPTIONAL PROCESSING
   Blank = Standard processing
   B = Burn-in

Valid Combinations

AM29331, AM29331A     GC, GCB

Valid Combinations

Valid Combinations list configurations planned to be
supported in volume for this device. Consult the local AMD
sales office to confirm availability of specific valid
combinations, to check on newly released valid combinations,
and to obtain additional data on AMD's standard military
grade products.
PIN DESCRIPTION

A0-A15  Alternate Data (Input)
   Input to address multiplexer and counter.

A-FULL  Almost Full (Bidirectional, Three-State)
   Indicates that 28 ≤ SP ≤ 63 (meaning there are five or fewer
   empty locations left on the stack). Also active during stack
   underflow.

Cin  Carry In (Input, Active LOW)
   Carry-in to the incrementer.

CP  Clock Pulse (Input)
   Clocks the sequencer at the LOW-to-HIGH transition.

D0-D15  Data (Bidirectional, Three-State)
   Input to address multiplexer, counter, stack, and comparator
   register. Output for stack and stack pointer.

EQUAL  Equal (Bidirectional, Three-State)
   Indicates that the address comparator is enabled and has
   found a match.

ERROR  Error (Output, Active HIGH)
   Indicates a master/slave error in the slave mode. Indicates
   a malfunctioning driver or contention of any output in the
   master mode.

FC  Force Continue (Input, Active HIGH)
   Overrides the instruction with CONTINUE.

HOLD  Hold (Input, Active HIGH)
   Stops the sequencer and three-states the outputs.

I0-I5  Instruction (Input)
   Selects one of 64 instructions.



FUNCTIONAL DESCRIPTION 

Architecture 

The major blocks of the sequencer are the address multiplexer,
the address register (AR), the stack (with the top of stack
denoted TOS), the counter (C), the test multiplexer with logic,
and the address comparison register (R) (Figure 1). The
bidirectional D-bus provides branch addresses and iteration
counts; it also allows access to the stack from the outside.
The A-bus may be used for map addresses. There are four
sets of 4-bit multiway branch inputs (M). The bidirectional
Y-bus either outputs microprogram addresses or inputs interrupt
addresses. The buses are all 16 bits wide. Figure 1 shows
a detailed block diagram of the sequencer.



INTA  Interrupt Acknowledge (Bidirectional, Three-State, Active LOW)
   Indicates that an interrupt is accepted.

INTEN  Interrupt Enable (Input, Active HIGH)
   Enables interrupts.

INTR  Interrupt Request (Input, Active HIGH)
   Requests the sequencer to interrupt execution.

M0-3, 0-3  Multiway (Input)
   Four sets of multiway inputs providing 16-way branches.
   The first index refers to the set number.

OEd  Output Enable, D-Bus (Input, Active HIGH)
   Enables the D-bus driver, provided that the sequencer is not
   in the hold or slave mode.

RST  Reset (Input, Active LOW)
   Resets the sequencer.

S0-S3  Select (Input)
   Selects one of 16 test conditions.

SLAVE  Slave (Input, Active HIGH)
   Makes the sequencer a slave.

T0-T11  Test (Input)
   Provides external test inputs.

Y0-Y15  Address (Bidirectional, Three-State)
   Output of microcode address. Input for interrupt address.



Address Multiplexer

The address multiplexer can select an address from any of
five sources:

1) A branch address supplied by the D-bus

2) A branch address supplied by the A-bus

3) A multiway-branch address

4) A return or loop address from the top of stack

5) The next sequential address from the incrementer

Multiway-Branch Address

A multiway-branch address is formed by substituting the lower
four bits of the address on the D-bus (D3, D2, D1, D0) with one
of the four sets (M0x, M1x, M2x, or M3x) of 4-bit multiway-
branch addresses. The multiway-branch set is selected by the
number D1D0, while the bits D3 and D2 are "don't cares."
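
The substitution described above can be sketched in a few lines of Python. This is only an illustrative model, not part of the data sheet: the function name and the representation of the four multiway sets as a list of 4-bit integers are assumptions made for the example.

```python
def multiway_branch_address(d: int, m_sets: list[int]) -> int:
    """Form a 16-bit multiway-branch address.

    d      -- 16-bit value on the D-bus
    m_sets -- four 4-bit values [M0x, M1x, M2x, M3x] (hypothetical representation)
    """
    set_index = d & 0b11      # D1 D0 select one of the four multiway sets
    base = d & 0xFFF0         # D15-D4 form the base address; D3, D2 are ignored
    return base | (m_sets[set_index] & 0xF)


# D = 0x1232: D1D0 = 10 selects set M2, whose value 0xA replaces the low nibble.
print(hex(multiway_branch_address(0x1232, [0x0, 0x5, 0xA, 0xF])))
```

Because D3 and D2 are not decoded, any value in those two bit positions yields the same branch target.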
[Figure BD007460: the branch address (bits 15-0) supplies the base address; the multiway inputs M3-M0 supply the low four bits of the address out, which indexes one of four lookup tables: Table 1 (M0x), Table 2 (M1x), Table 3 (M2x), Table 4 (M3x).]

Notes: 1. D1 and D0 select one out of four multiway sets. D3 and D2 are "don't cares."

2. Each set of M3x - M0x can select one of sixteen locations. The multiway-branch address is the
concatenation of D15-D4 (base address) and Mx3-Mx0.

3. For a given base address, there can be four look-up tables, each sixteen deep.

Figure 2. Multiway Branch
Address Register

The address register contains the current address. It is loaded
from the interrupt multiplexer and feeds the incrementer. The
incrementer is inhibited if Cin is taken HIGH.

Stack

A 33-word-deep and 16-bit-wide stack provides first-in last-out
storage for return addresses, loop addresses, and counter
values. Items to be pushed come from the incrementer, the
interrupt-return-address register, the counter, or the D-bus.
Items popped go to the address multiplexer, the counter, or
the D-bus.

The access to the stack via the D-bus may be used for context
switching, stack extension, or diagnostics. As the stack is only
accessible from the top, stack extension is done by temporarily
storing the whole or some lower part of the stack outside the
sequencer. The save and the later restore are done with pop
and push operations, respectively, at balanced points in the
microprogram; for example, points with the same stack depth.
The internal D-bus driver must be turned on when popping an
item to the D-bus; if the driver is off, the item will be unstacked
instead. The driver is normally turned on when the Output
Enable signal is asserted and the sequencer is not being reset
(OEd = 1, RST = 1).

The stack pointer is a modulo 64 counter, which is incremented
on each push and decremented on each pop. The stack
pointer is reset to zero when the sequencer is reset, but the
pointer may also be reset by instruction. Thus, the stack
pointer indicates the number of items on the stack as long as
stack overflow or underflow has not occurred. Overflow
happens when an item is pushed onto a full stack, whereby
the item at the bottom of the stack is overwritten. Underflow
happens when an item is popped from an empty stack; in this
case the item is undefined.

The contents of the stack pointer are present on the D-bus for
all instructions except POP D, provided the driver is turned on.
The output signal, A-FULL, is active under the following
conditions: 28 ≤ SP ≤ 63.
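
As a rough illustration of the stack-pointer behavior, the sketch below models only the modulo-64 pointer and the A-FULL flag, not the stack contents. The class and method names are hypothetical, and the A-FULL range is read as inclusive (28 through 63), which is an assumption about the printed inequality.

```python
class StackPointerModel:
    """Toy model of the Am29331 stack pointer (illustrative only)."""

    def __init__(self) -> None:
        self.sp = 0                       # reset initializes the pointer to 0

    def reset(self) -> None:
        self.sp = 0

    def push(self) -> None:
        self.sp = (self.sp + 1) % 64      # modulo-64 increment on each push

    def pop(self) -> None:
        self.sp = (self.sp - 1) % 64      # modulo-64 decrement on each pop

    @property
    def a_full(self) -> bool:
        # Assumed inclusive range: active when 28 <= SP <= 63.
        return 28 <= self.sp <= 63


model = StackPointerModel()
for _ in range(28):
    model.push()
print(model.sp, model.a_full)             # 28 items pushed: five locations left

model.reset()
model.pop()                               # pop from an empty stack (underflow)
print(model.sp, model.a_full)             # pointer wraps to 63
```

The wrap to 63 on underflow is what makes A-FULL double as an underflow indicator in this model.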

Counter 

The counter may be used as a loop counter. It may be loaded
from the D-bus, the A-bus, or via a pop from the stack. Its
contents may also be pushed onto the stack.

A normal for-loop is set up by a FOR instruction, which loads
the counter from the D- or A-bus with the desired number of
iterations; the instruction also pushes onto the stack a loop
address that points to the next sequential instruction. The end
of the loop is given by an unconditional END FOR instruction,
which tests the counter value against the value one and then
decrements the counter. If the values differ, the loop is
repeated by selecting the address at the top of the stack as the
next address. If the values are equal, the loop is terminated by
popping the stack, thereby removing the loop address, and
selecting the address from the incrementer as the next
address. The number of iterations is a 16-bit unsigned number,
except that the number zero corresponds to 65,536 iterations.
By pushing and popping counter values it is possible to handle
nested loops.
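
The test-then-decrement rule explains why a count of zero gives 65,536 iterations. The following sketch simulates the counter as a 16-bit register under that rule; the function name is hypothetical and the model ignores the stack and address paths.

```python
def for_loop_iterations(count: int) -> int:
    """Count loop-body executions for a FOR ... END FOR pair (illustrative)."""
    counter = count & 0xFFFF                  # FOR loads the 16-bit counter
    iterations = 0
    while True:
        iterations += 1                       # the loop body runs once
        at_one = counter == 1                 # END FOR tests against one ...
        counter = (counter - 1) & 0xFFFF      # ... and then decrements
        if at_one:
            return iterations                 # loop terminates


print(for_loop_iterations(10))
print(for_loop_iterations(0))
```

With a starting value of zero the counter wraps to 0xFFFF on the first END FOR and only reaches one after a full 16-bit cycle.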

Address Comparison 

The sequencer is able to compare the address from the
interrupt multiplexer with the contents of the comparator
register. The instruction SET loads the comparator register
with the address on the D-bus and enables the comparison,
while CLEAR disables it. The comparison is disabled at reset.
A HIGH is present at the output EQUAL if the comparison is
enabled and the two addresses are equal. The comparison is
useful for detection of a break point or counting the number of
times a microinstruction at a specific address is executed.

Instruction Set 

The sequencer has 64 instructions that are divided into four
classes of 16 instructions each. The instruction lines I0-I5
use I5 and I4 to select a class, and I0-I3 to select an
instruction within a class. The classes are:

I5  I4   Classes
0   0    Conditional sequence control,
0   1    Conditional sequence control with inverted polarity,
1   0    Unconditional sequence control, and
1   1    Special function with implicit continue.

Note that for the first three classes I5 forces the condition to
be true and I4 inverts the condition. The basic instructions of
the first three classes are shown in Table 1 and the instructions
of the fourth class in Table 2.
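
A small sketch of the class decoding and of the condition rule given in the Table 1 legend, Cond. = (Test[S] OR I5) XOR I4. The function names and the class-name strings are illustrative choices, not identifiers from the data sheet.

```python
CLASS_NAMES = {
    0b00: "conditional",
    0b01: "conditional, inverted polarity",
    0b10: "unconditional",
    0b11: "special function (implicit continue)",
}


def instruction_class(opcode: int) -> str:
    """Return the class selected by I5 and I4 of a 6-bit opcode."""
    return CLASS_NAMES[(opcode >> 4) & 0b11]


def condition(test: bool, opcode: int) -> bool:
    """Evaluate Cond. = (Test OR I5) XOR I4 for the first three classes."""
    i5 = bool(opcode & 0x20)
    i4 = bool(opcode & 0x10)
    return (test or i5) != i4


# 0x00 = Goto D, conditional; 0x10 = same with inverted polarity; 0x20 = unconditional.
for opcode in (0x00, 0x10, 0x20):
    print(hex(opcode), instruction_class(opcode),
          condition(False, opcode), condition(True, opcode))
```

For opcodes 20h-2Fh the condition evaluates to true regardless of the selected test, which is how the unconditional class falls out of the same logic.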

Structured microprogramming is supported by sequencer
instructions that singly or in pairs correspond to high-level
language control constructs. Examples are FOR I := D DOWN
TO 1 DO . . . END FOR and CASE N OF . . . END CASE. The
instructions have been given high-level language names
where appropriate. Figure 3 shows how to microprogram
important control constructs; the high-level language is on the
left and the microcode on the right.

Test Conditions 

The condition for a conditional instruction is supplied by a test
multiplexer, which selects one out of sixteen tests with the
select lines S0-S3. Twelve of these are supplied directly by
the inputs T0-T11, while the remaining four tests are generated
by the test logic from the inputs T8-T11. The following
table shows the assignments.

S0-S3 (hex)   Test                 Intended Use
0-7           T0-T7                General
8             T8                   C (Carry)
9             T9                   N (Negative)
A             T10                  V (Overflow)
B             T11                  Z (Zero or equal)
C             T8 + T11             C + Z (Unsigned less than or equal, borrow mode)
D             NOT T8 + T11         NOT C + Z (Unsigned less than or equal)
E             T9 ⊕ T10             N ⊕ V (Signed less than)
F             (T9 ⊕ T10) + T11     (N ⊕ V) + Z (Signed less than or equal)
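
The test selection can be modeled as a small function. This is a sketch under stated assumptions: the select code is taken as the 4-bit number S3..S0, the inputs are passed as a list of twelve booleans, and select code D is assumed to use the complement of T8 (the non-borrow form of "unsigned less than or equal"), since the printed table does not show the two unsigned tests distinctly.

```python
def select_test(s: int, t: list[bool]) -> bool:
    """Return the test condition chosen by select code s (0-15).

    t is a list of the twelve test inputs T0..T11 (True = HIGH).
    """
    if not 0 <= s <= 0xF:
        raise ValueError("select code must be in the range 0..15")
    if len(t) != 12:
        raise ValueError("expected twelve test inputs T0..T11")
    c, n, v, z = t[8], t[9], t[10], t[11]
    if s <= 0xB:
        return t[s]               # T0-T11 passed through directly
    if s == 0xC:
        return c or z             # unsigned <=, borrow mode
    if s == 0xD:
        return (not c) or z       # assumed: unsigned <=, carry = NOT borrow
    if s == 0xE:
        return n != v             # signed <
    return (n != v) or z          # signed <=


inputs = [False] * 12
inputs[9] = True                  # N set, V clear
print(select_test(0xE, inputs), select_test(0xF, inputs), select_test(0xC, inputs))
```

Routing the ALU flags C, N, V, and Z to T8-T11 lets the four derived tests implement the usual magnitude comparisons without extra external logic.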



Force Continue 

The sequencer has a force continue (FC) input, which overrides
the instruction inputs I0-I5 with a CONTINUE instruction.
This makes it possible to share the microinstruction field
for the sequencer instruction with some other control or to
initialize a writable control store.

Reset

In order to start a microprogram properly, the sequencer must
be reset. The reset works like an instruction overriding both
the instruction input and the force continue input. The reset
selects the address at the address multiplexer, forces the
EQUAL output to LOW, and disregards a potential interrupt
request. It synchronously disables the address comparison
and initializes the stack pointer to 0. The contents of the stack
are invalid after a reset.
TABLE 1. INSTRUCTION SET for I5I4 = 00, 01, 10

                                  Cond.: Fail     Cond.: Pass
I5-I0       Instruction           Y     Stack     Y     Stack            Counter    Comp.   D-Mux
00, 10, 20  Goto D                INC   -         D     -                -          -       SP
01, 11, 21  Call D                INC   -         D     Push INC         -          -       SP
02, 12, 22  Exit D                INC   -         D     Pop              -          -       SP
03, 13, 23  End for D, C ≠ 1      INC   -         D     -                C ← C-1    -       SP
            End for D, C = 1      INC   -         INC   -                C ← C-1    -       SP
04, 14, 24  Goto A                INC   -         A     -                -          -       SP
05, 15, 25  Call A                INC   -         A     Push INC         -          -       SP
06, 16, 26  Exit A                INC   -         A     Pop              -          -       SP
07, 17, 27  End for A, C ≠ 1      INC   -         A     -                C ← C-1    -       SP
            End for A, C = 1      INC   -         INC   -                C ← C-1    -       SP
08, 18, 28  Goto M                INC   -         D:M   -                -          -       SP
09, 19, 29  Call M                INC   -         D:M   Push INC         -          -       SP
0A, 1A, 2A  Exit M                INC   -         D:M   Pop              -          -       SP
0B, 1B, 2B  End for M, C ≠ 1      INC   -         D:M   -                C ← C-1    -       SP
            End for M, C = 1      INC   -         INC   -                C ← C-1    -       SP
0C, 1C, 2C  End Loop              INC   Pop       TOS   -                -          -       SP
0D, 1D, 2D  Call Coroutine        INC   -         TOS   Pop & Push INC   -          -       SP
0E, 1E, 2E  Return                INC   -         TOS   Pop              -          -       SP
0F, 1F, 2F  End for, C ≠ 1        INC   Pop       TOS   -                C ← C-1    -       SP
            End for, C = 1        INC   Pop       INC   Pop              C ← C-1    -       SP

Cond. = (Test [S] OR I5) XOR I4
:     = Concatenation
C     = Counter
INC   = Output of Incrementer = AR + 1 (if Cin = LOW)

Note: For unconditional instructions, the action marked under Cond.: Pass is taken.





TABLE 2. INSTRUCTION SET for I5I4 = 11

I5-I0   Instruction      Y     Stack       Counter    Comp.           D-Mux
30      Continue         INC   -           -          -               SP
31      For D            INC   Push INC    C ← D      -               SP
32      Decrement        INC   -           C ← C-1    -               SP
33      Loop             INC   Push INC    -          -               SP
34      Pop D            INC   Pop         -          -               TOS
35      Push D           INC   Push D      -          -               SP
36      Reset SP         INC   SP ← 0      -          -               SP
37      For A            INC   Push INC    C ← A      -               SP
38      Pop C            INC   Pop         C ← TOS    -               SP
39      Push C           INC   Push C      -          -               SP
3A      Swap             INC   TOS ← C     C ← TOS    -               SP
3B      Push C Load D    INC   Push C      C ← D      -               SP
3C      Load D           INC   -           C ← D      -               SP
3D      Load A           INC   -           C ← A      -               SP
3E      Set              INC   -           -          R ← D, Enable   SP
3F      Clear            INC   -           -          Disable         SP

R = Comp. Register
Interrupts

The sequencer may be interrupted at the completion of the
current microcycle by asserting the interrupt request input
INTR. The return address of the interrupted routine is saved
on the stack so that nested interrupts can be easily implemented.
An interrupt is accepted if interrupts are enabled and
the sequencer is not being reset or held (INTEN = HIGH,
RST = HIGH, and HOLD = LOW). The interrupt-acknowledge
output (INTA) goes LOW when an interrupt is accepted.

When there is no interrupt, addresses go from the address
multiplexer to the Y-bus via the driver, and to the address
register and the comparator via the interrupt multiplexer. When
there is an interrupt, the driver of the sequencer is turned off,
an external driver is turned on, and the interrupt multiplexer is
switched. The interrupt address is supplied via the external
driver to the Y-bus, the address register, and the comparator
(Figure 4). In order to save the address from the address
multiplexer, the address is stored in the interrupt return
address register, which for simplicity is clocked every cycle.
The next microinstruction is the first microinstruction of the
interrupt routine (Figure 5).

In this cycle the address in the interrupt return address register
is automatically pushed onto the stack. Therefore the
microinstruction in this cycle must not use the stack; if a stack
operation is programmed, the result is undefined. The instructions
that do not use the stack are GOTO D, GOTO A, GOTO
M, CONTINUE, DECREMENT, LOAD D, LOAD A, SET and
CLEAR. A RETURN instruction terminates the interrupt routine
and the interrupted routine is resumed. Interrupts only work
with a single-level control path.
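
The acceptance condition and the restriction on the first microinstruction of an interrupt routine can be captured in a short check. This is a sketch only: signal levels are modeled as booleans (True = HIGH), and the function and constant names are hypothetical.

```python
# Instructions listed in the text as not using the stack; only these are
# safe as the first microinstruction of an interrupt routine.
STACK_FREE_INSTRUCTIONS = frozenset({
    "GOTO D", "GOTO A", "GOTO M", "CONTINUE", "DECREMENT",
    "LOAD D", "LOAD A", "SET", "CLEAR",
})


def interrupt_accepted(intr: bool, inten: bool, rst: bool, hold: bool) -> bool:
    """True when a request is accepted: INTR and INTEN HIGH, RST HIGH, HOLD LOW."""
    return intr and inten and rst and not hold


def inta_level(intr: bool, inten: bool, rst: bool, hold: bool) -> bool:
    """Level of the active-LOW INTA output (False = LOW = interrupt accepted)."""
    return not interrupt_accepted(intr, inten, rst, hold)


def safe_first_instruction(name: str) -> bool:
    """Check a mnemonic against the stack-free list above."""
    return name.upper() in STACK_FREE_INSTRUCTIONS


print(interrupt_accepted(True, True, True, False))   # request accepted
print(inta_level(True, True, True, True))            # held: INTA stays HIGH
print(safe_first_instruction("Call D"))              # uses the stack
```

A microassembler could apply a check like `safe_first_instruction` to every interrupt entry point to catch undefined stack behavior before the microcode is loaded.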

Traps 

A trap is an unexpected situation linked to the current
microinstruction that must be handled before the microinstruction
completes and changes the state of the system. An example
of such a situation is an attempt to read a word from memory
across a word boundary in a single cycle. When a trap occurs,
the current microinstruction must be aborted and re-executed
after the execution of a trap routine, which in the meantime will
take corrective measures. An interrupt, on the other hand, is
not linked directly to the current microinstruction, which can
complete safely before an interrupt routine is executed.

Execution of a trap requires that the sequencer ignore the
current microinstruction, select the trap return address at the
address multiplexer, and initiate an interrupt. This will save the
trap return address on the stack and issue the trap address
from an external source (Figure 6). The address register
contains the address of the microinstruction in the pipeline
register; thus the address register already contains the trap
return address when a trap occurs. This address can be
selected by the address multiplexer by disabling the incrementer
(Cin = 1) and using the force continue mode (FC = 1). In
this mode the sequencer ignores the current microinstruction.
The remaining part of the trap handling is done by the interrupt
(Figure 7); thus the section on interrupts also applies to traps.
There is one exception, however. The interrupt enable cannot
be used as a trap enable, as it does not control the force
continue mode and the carry-in to the incrementer.

Hold Mode 

The sequencer has a hold mode in which the operation is
suspended.

When the HOLD signal goes active, the outputs (Y, INTA,
A-FULL, and EQUAL) are disabled and the sequencer enters the
hold mode after the current cycle. While the sequencer is in
this mode, the internal state is left unchanged and the D-bus is
disabled. When the HOLD signal goes inactive, the outputs (Y,
INTA, A-FULL, and EQUAL) are enabled again and the sequencer
leaves the hold mode after the current cycle.

In a time-multiplexed multimicroprocess system there may be
one sequencer for all processes with microprogrammed context
save and restore, or there may be one sequencer per
microprocess permitting fast process switch. In the latter case
the Y-buses of the sequencers are tied together and connected
to a single microprogram store. A control unit decides on a
cycle-by-cycle basis which sequencer should be running, and
activates the HOLD signal to the remaining sequencers. The
hold mode has higher priority than interrupts, and works
independently of the reset. The hold mode can only be used
with a single-level control path.

Master/Slave Configuration 

In some systems reliability is very important. The master/slave
configuration, which consists of two sequencers operated in
parallel, is able to detect faults in both the interconnect and the
internal function of the sequencers. One sequencer is the
master and operates normally. The other is the slave, i.e., all
outputs except the signal ERROR are turned into inputs and
connected to the outputs of the master. Since the slave is
operated in parallel with the master, it can compare its result
with the result of the master and signal an error if they differ.
The error signal from the master indicates a malfunctioning
driver or contention. Because a TTL output goes HIGH when
power is missing, the ERROR signal also indicates power
failure.
High-Level Language Constructs

An example of high-level language constructs using Am29331
instructions is given in Figure 3 (3-1, 3-2, 3-3, and 3-4).

REPEAT                          LOOP
UNTIL CC                        END LOOP NOT CC

WHILE CC DO                     LOOP
                                IF NOT CC THEN EXIT L
END WHILE                       END LOOP
                                L:

LOOP                            LOOP
IF CC THEN EXIT                 IF CC THEN EXIT L
END LOOP                        END LOOP
                                L:

Figure 3-1. Loops with Unknown Number of Iterations

FOR CNT := 10 DOWN TO 1 DO      FOR D 10
END FOR                         END FOR

Figure 3-2. Loop with Known Number of Iterations

                                PUSH D B
CASE I OF                       GOTO M
0: -                            A: -
                                -, RETURN (TO B)
1: -                            A + 2: -
                                -, RETURN (TO B)
2: -                            A + 4: -
                                -, RETURN (TO B)
3: -                            A + 6: -
                                -, RETURN
END CASE                        B:

Figure 3-3. Case Statement
(with D = A15 . . . A4XX00 and M0,0-3 = A3 I1 I0 0 during the
GOTO M instruction. A1A0 must be 00, and X signifies a don't
care.)

                                PUSH D C
IF X THEN                       IF NOT X THEN GOTO A
  IF Y THEN                     IF NOT Y THEN GOTO B
    -                           -
    -                           -, RETURN (TO C)
  ELSE                          B:
    -                           -
    -                           -, RETURN (TO C)
  END IF
ELSE                            A:
  IF Z THEN                     IF NOT Z THEN GOTO D
    -                           -, RETURN (TO D)
  ELSE                          D:
    -                           -
    -                           -, RETURN (TO C)
  END IF
END IF                          C:

Figure 3-4. Double-Nested If Statement
Figure 4. Am29331 Interrupt Cycle 1
While executing the instruction at A, the sequencer is interrupted and directed to B.

Figure 5. Am29331 Interrupt Cycle 2
Executing at B; the return address A + 1 is pushed onto the stack.

Figure 6. Am29331 Traps Cycle 1
A trap occurs at the instruction A, and the sequencer is directed to B. The instruction at A is trapped by FC = 1, Cin = 1, INTR = 1.

Figure 7. Am29331 Traps Cycle 2
Instruction Set Definition

Legend: • = Other instruction
        O = Instruction being described
        CC = (Test [S3-S0])
        P = Test pass
        F = Test fail
        o = Register in part

Note: Opcode numbers are in hexadecimal notation.

Opcode (I5-I0)   Mnemonics   Description

20h   BRA_D
   GOTO D
   Unconditional branch to the address specified
   by the D inputs. The D port must be disabled to
   avoid bus contention.

24h   BRA_A
   GOTO A
   Unconditional branch to the address specified
   by the A inputs.

28h   BRA_M
   GOTO Multiway (D15-D4 Mx3-Mx0)
   Unconditional branch to the address specified
   by the M inputs concatenated with the D input.
   The lower four bits on the D bus (D3-D0) are
   replaced by one of the four sets of the four-bit
   multiway branch addresses. The multiway
   branch set is selected by bits D1 and D0 while
   bits D3 and D2 are "don't cares."

2Ch   BRA_S
   GOTO TOS
   Unconditional branch to the address on the top
   of the stack.

00h   BRCC_D
   IF CC THEN GOTO D
   ELSE CONTINUE
   If CC is HIGH (pass), branch to the address
   specified by D. If CC is LOW (fail), continue.
   The D port must be disabled to avoid bus
   contention.

04h
   IF CC THEN GOTO A
   ELSE CONTINUE
   If CC is HIGH (pass), branch to the address
   specified by A. If CC is LOW (fail), continue.

08h   BRCC_M
   IF CC THEN GOTO Multiway (D15-D4 Mx3-Mx0)
   ELSE CONTINUE
   If CC is HIGH (pass), branch to the address
   specified by the D inputs concatenated with the M
   inputs. If CC is LOW (fail), continue. The lower
   four bits on the D bus (D3-D0) are replaced by
   one of the four sets of the 4-bit multiway
   branch addresses. The multiway branch set is
   selected by bits D1 and D0 while bits D3 and D2
   are "don't cares."

0Ch   BRCC_S
   IF CC THEN GOTO TOS
   ELSE
      POP STACK
      CONTINUE
   If CC is HIGH (pass), branch to the address on
   the top of the stack. If CC is LOW (fail), pop the
   stack and continue.

[Figure PF001730: execution example]
10h   BRNC_D
   IF NOT CC THEN GOTO D
   ELSE CONTINUE
   If CC is LOW (pass), branch to the address
   specified by D. If CC is HIGH (fail), continue.
   The D port must be disabled to avoid bus
   contention.

14h
   IF NOT CC THEN GOTO A
   ELSE CONTINUE
   If CC is LOW (pass), branch to the address
   specified by A. If CC is HIGH (fail), continue.

18h   BRNC_M
   IF NOT CC THEN GOTO Multiway (D15-D4 Mx3-Mx0)
   ELSE CONTINUE
   If CC is LOW (pass), branch to the address
   specified by the D inputs concatenated with the M
   inputs. If CC is HIGH (fail), continue. The lower
   four bits on the D bus (D3-D0) are replaced by
   one of the four sets of the 4-bit multiway
   branch addresses. The multiway branch set is
   selected by bits D1 and D0 while bits D3 and D2
   are "don't cares."

1Ch   BRNC_S
   IF NOT CC THEN GOTO TOS
   ELSE
      POP STACK
      CONTINUE
   If CC is LOW (pass), branch to the address on
   the top of the stack. If CC is HIGH (fail), pop the
   stack and continue.

21h
   CALL D
   Unconditional branch to the subroutine
   specified by the D inputs. Push the return
   address (Address Reg. + 1) on the stack. The
   D port must be disabled to avoid bus
   contention.

25h
   CALL A
   Unconditional branch to the subroutine
   specified by the A inputs. Push the return
   address (Address Reg. + 1) on the stack.

29h
   CALL Multiway (D15-D4 Mx3-Mx0)
   Unconditional branch to the subroutine
   specified by the D inputs concatenated with the
   multiway inputs. Push the return address
   (Address Reg. + 1) on the stack. The lower
   four bits on the D bus (D3-D0) are replaced by
   one of the four sets of the 4-bit multiway
   branch addresses. The multiway branch set is
   selected by bits D1 and D0 while bits D3 and D2
   are "don't cares."

2Dh
   CALL TOS
   Unconditional branch to the subroutine
   specified by the address on the top of the
   stack. The stack is popped and the return
   address (Address Reg. + 1) is then pushed
   onto the stack.

[Figure: execution example]
01h   CCC_D
   IF CC, THEN CALL D
   ELSE CONTINUE
   If CC is HIGH (pass), call the subroutine
   specified by the D inputs. Push the return
   address (Address Reg. + 1) on the stack. If CC
   is LOW (fail), continue. The D port must be
   disabled to avoid bus contention.

05h   CCC_A
   IF CC, THEN CALL A
   ELSE CONTINUE
   If CC is HIGH (pass), call the subroutine
   specified by the A inputs. Push the return
   address (Address Reg. + 1) on the stack. If CC
   is LOW (fail), continue.

09h   CCC_M
   IF CC, THEN CALL Multiway (D15-D4 Mx3-Mx0)
   ELSE CONTINUE
   If CC is HIGH (pass), call the subroutine
   specified by the D inputs concatenated with the
   M inputs. Push the return address (Address
   Reg. + 1) on the stack. The lower four bits on
   the D bus (D3-D0) are replaced by one of the
   four sets of the 4-bit multiway branch
   addresses. The multiway branch set is selected
   by bits D1 and D0 while bits D3 and D2 are
   "don't cares."

0Dh   CCC_S
   IF CC, THEN CALL TOS
   ELSE CONTINUE
   If CC is HIGH (pass), call the subroutine
   specified by the address on the top of the
   stack. The stack is popped and the return
   address (Address Reg. + 1) is pushed onto the
   stack. If CC is LOW (fail), continue.

11h   CNC
   IF NOT CC, THEN CALL D
   ELSE CONTINUE
   If CC is LOW (pass), call the subroutine
   specified by the D inputs. Push the return
   address (Address Reg. + 1) on the stack. If CC
   is HIGH (fail), continue. The D port must be
   disabled to avoid bus contention.

15h   CNC_A
   IF NOT CC, THEN CALL A
   ELSE CONTINUE
   If CC is LOW (pass), call the subroutine
   specified by the A inputs. Push the return
   address (Address Reg. + 1) on the stack. If CC
   is HIGH (fail), continue.

19h   CNC_M
   IF NOT CC, THEN CALL Multiway (D15-D4 Mx3-Mx0)
   ELSE CONTINUE
   If CC is LOW (pass), call the subroutine
   specified by the D inputs concatenated with the
   M inputs. Push the return address (Address
   Reg. + 1) on the stack. The lower four bits on
   the D bus (D3-D0) are replaced by one of the
   four sets of the 4-bit multiway branch
   addresses. The multiway branch set is selected
   by bits D1 and D0 while bits D3 and D2 are
   "don't cares."

1Dh
   IF NOT CC, THEN CALL TOS
   ELSE CONTINUE
   If CC is LOW (pass), call the subroutine
   specified by the address on the top of the
   stack. The stack is popped and the return
   address (Address Reg. + 1) is pushed onto the
   stack.

[Figure: execution examples]
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Opcode 
(is - io) 



MRsmoRscs 



Description 



Execution Example 



Z2h 

26h 
2Ah 



EXITD 



EXIT_A 



2Eh 



02h 



06h 



OAh 



XTCC D 



XTCCA 



XTCC_M 



EXIT TO D 

Unconditional branch to the address specified 
by the D inputs and pop the stack. The D port 
must be disabled to avoid bus contention. 

EXIT TO A 

Unconditional branch to the address specified 

by the A inputs and pop the stack. 

EXIT TO Multiway (D15-D4 Mx3-Mxo) 
Unconditional branch to the address specified 
Ijy the D inputs concatenated with the M inputs 
and pop the stack. The lower four bits on the D 
bus {D3 - Do) are replaced by one of ttie four 
sets of the 4-bit multiway txanch addresses. 
The multiway branch set is selected by tits D-\ 
and Do while D3 and D2 are "don't cares." 

EXIT TO TOS 

Unconditional branch to the address on the top 
of the stack and pop the stack. Also used for 
unconditional returns. 



IF CC, THEN EXIT TO D 

ELSE CONTINUE 

If CC is HIGH (pass), exit to the address 

specified by the D inputs and pop the stack. If 

CC is LOW (fail), continue with no pop. The D 

port must be disabled to avoid bus contention. 

IF CC, THEN EXIT TO A 

ELSE CONTINUE 

If CC is HIGH (pass), exit to the address 

specified by the A inputs and pop the stack. If 

CC is LOW (fail), continue with no pop. 

IF CC, THEN EXIT TO Multiway 
(D15-D4 Mx3-Mxo) 
ELSE CONTINUE 

If GC is HIGH (pass), exit to the address 
specified by the D inpute concatenated with the 
M inputs and pop the stack. The lower four bits 
on the D bus (D3- Do) ate replaced by one of 
the four sets of the 4-bit multiway branch 
addresses. The multiway branch set is selected 
by bits Di and Do while bits D3 and Da are 
"don't cares." 



0Eh        XTCC_S

IF CC, THEN EXIT TO TOS
ELSE CONTINUE

If CC is HIGH (pass), exit to the address on the
top of the stack and pop the stack. If CC is
LOW (fail), continue with no pop. Also used for
conditional returns.
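As a rough behavioral sketch of the conditional exit instructions, the following Python function models one step. It assumes that "continue" means the next sequential address (`pc + 1`) and uses a Python list for the stack; the function and parameter names are illustrative, not part of the device definition.

```python
def exit_if_cc(cc, target, pc, stack):
    """Model one conditional-exit step and return the next microaddress.

    cc     -- True when the tested condition is HIGH (pass)
    target -- branch address (D, A, or multiway), or None for exit to TOS
    pc     -- address of the current microinstruction (assumed model)
    stack  -- list; the last element models the top of stack
    """
    if not cc:
        return pc + 1              # fail: continue, stack is not popped
    top = stack.pop()              # pass: every variant pops the stack
    return top if target is None else target
```

A conditional return is then simply the TOS variant: `exit_if_cc(True, None, pc, stack)` returns the address that was on top of the stack.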



Note: Opcode numbers are in hexadecimal notation.

Opcode
(I5-I0)    Mnemonics

12h        XTNCC_D
16h        XTNCC_A
1Ah        XTNCC_M
1Eh        XTNCC_S
23h        DJMP_D
27h        DJMP_A
2Bh        DJMP_M
2Fh        DJMP_S

Description



IF NOT CC, THEN EXIT TO D
ELSE CONTINUE

If CC is LOW (pass), exit to the address
specified by the D inputs and pop the stack. If
CC is HIGH (fail), continue with no pop. The D
port must be disabled to avoid bus contention.

IF NOT CC, THEN EXIT TO A
ELSE CONTINUE

If CC is LOW (pass), exit to the address
specified by the A inputs and pop the stack. If
CC is HIGH (fail), continue with no pop.

IF NOT CC, THEN EXIT TO Multiway
(D15-D4, Mx3-Mx0)
ELSE CONTINUE

If CC is LOW (pass), exit to the address
specified by the D inputs concatenated with the
M inputs and pop the stack. The lower four bits
on the D bus (D3-D0) are replaced by one of
the four sets of the 4-bit multiway branch
addresses. The multiway branch set is selected
by bits D1 and D0 while bits D3 and D2 are
"don't cares."

IF NOT CC, THEN EXIT TO TOS
ELSE CONTINUE

If CC is LOW (pass), exit to the address on the
top of the stack and pop the stack. If CC is
HIGH (fail), continue with no pop. Also used for
conditional returns.



IF CNT ≠ 1 THEN CNT := CNT - 1
  GOTO D
ELSE CNT := CNT - 1
  CONTINUE

If the counter is not equal to one, decrement
the counter and branch to the address
specified by the D inputs. If the counter is equal
to one, then decrement the counter and
continue. The D port must be disabled to avoid
bus contention.

IF CNT ≠ 1 THEN CNT := CNT - 1
  GOTO A
ELSE CNT := CNT - 1
  CONTINUE

If the counter is not equal to one, decrement
the counter and branch to the address
specified by the A inputs. If the counter is equal
to one, then decrement the counter and
continue.

IF CNT ≠ 1 THEN CNT := CNT - 1
  GOTO Multiway (D15-D4, Mx3-Mx0)
ELSE CNT := CNT - 1
  CONTINUE

If the counter is not equal to one, decrement
the counter and branch to the address
specified by the D inputs concatenated with the
M inputs. The lower four bits on the D bus
(D3-D0) are replaced by one of the four sets
of the 4-bit multiway branch addresses. The
multiway branch set is selected by bits D1 and
D0 while bits D3 and D2 are "don't cares."

IF CNT ≠ 1 THEN CNT := CNT - 1
  GOTO TOS
ELSE CNT := CNT - 1
  POP STACK
  CONTINUE

If the counter is not equal to one, decrement
the counter and branch to the address on the
top of the stack. If the counter is equal to one,
then decrement the counter, pop the stack and
continue.
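The decrement-and-jump behavior above might be modeled as follows. This is a sketch under stated assumptions: a 16-bit counter that wraps on decrement, "continue" meaning `pc + 1`, and a Python list standing in for the stack. None of the names come from the data sheet.

```python
def djmp(counter, target, pc, stack=None):
    """One decrement-and-jump step. Returns (next_address, new_counter).

    target -- branch address, or None to branch to the top of stack
    stack  -- list modelling the stack; only used when target is None
    """
    width_mask = 0xFFFF                        # assumed 16-bit counter
    looping = counter != 1
    new_counter = (counter - 1) & width_mask   # decremented on both paths
    if looping:
        return (stack[-1] if target is None else target), new_counter
    if target is None:
        stack.pop()                            # TOS variant pops on loop exit
    return pc + 1, new_counter
```

Note that the counter is decremented whether or not the branch is taken, so it reads zero after the final pass.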






Note: Opcode numbers are in hexadecimal notation.

Opcode
(I5-I0)    Mnemonics

03h        DJCC_D
07h        DJCC_A

Description



IF CC AND CNT ≠ 1 THEN CNT := CNT - 1
  GOTO D
ELSE CNT := CNT - 1
  CONTINUE

If CC is HIGH (pass) and the counter is not
equal to one, decrement the counter and
branch to the address specified by the D
inputs. If CC is LOW (fail) or the counter is
equal to one, then decrement the counter and
continue. The D port must be disabled to avoid
bus contention.

IF CC AND CNT ≠ 1 THEN CNT := CNT - 1
  GOTO A
ELSE CNT := CNT - 1
  CONTINUE

If CC is HIGH (pass) and the counter is not
equal to one, decrement the counter and
branch to the address specified by the A inputs.
If CC is LOW (fail) or the counter is equal to
one, then decrement the counter and continue.



0Bh        DJCC_M
0Fh        DJCC_S



IF CC AND CNT ≠ 1 THEN CNT := CNT - 1
  GOTO Multiway (D15-D4, Mx3-Mx0)
ELSE CNT := CNT - 1
  CONTINUE

If CC is HIGH (pass) and the counter is not
equal to one, decrement the counter and
branch to the address specified by the D inputs
concatenated with the M inputs. The lower four
bits on the D bus (D3-D0) are replaced by one
of the four sets of the 4-bit multiway branch
addresses. The multiway branch set is selected
by bits D1 and D0 while bits D3 and D2 are
"don't cares."

IF CC AND CNT ≠ 1 THEN CNT := CNT - 1
  GOTO TOS
ELSE CNT := CNT - 1
  POP STACK
  CONTINUE

If CC is HIGH (pass) and the counter is not
equal to one, decrement the counter and
branch to the address on the top of the stack. If
CC is LOW (fail) or the counter is equal to one,
then decrement the counter, pop the stack and
continue.



Note: Opcode numbers are in hexadecimal notation.

Opcode
(I5-I0)    Mnemonics

13h        DJNCC_D
17h        DJNCC_A
1Bh        DJNCC_M
1Fh        DJNCC_S

Description



IF NOT CC AND CNT ≠ 1 THEN
  CNT := CNT - 1
  GOTO D
ELSE CNT := CNT - 1
  CONTINUE

If CC is LOW (pass) and the counter is not
equal to one, decrement the counter and
branch to the address specified by the D
inputs. If CC is HIGH (fail) or the counter is
equal to one, then decrement the counter and
continue. The D port must be disabled to avoid
bus contention.

IF NOT CC AND CNT ≠ 1 THEN
  CNT := CNT - 1
  GOTO A
ELSE CNT := CNT - 1
  CONTINUE

If CC is LOW (pass) and the counter is not
equal to one, decrement the counter and
branch to the address specified by the A inputs.
The content of the interrupt return address
register and the address register is replaced by
the A address in this case. If CC is HIGH (fail)
or the counter is equal to one, the current
address is incremented, appears on the bus for
continue, and is stored into the above two
registers.

IF NOT CC AND CNT ≠ 1 THEN
  CNT := CNT - 1
  GOTO Multiway (D15-D4, Mx3-Mx0)
ELSE CONTINUE

If CC is LOW (pass) and the counter is not
equal to one, decrement the counter and
branch to the address specified by the D inputs
concatenated with the M inputs. The lower four
bits on the D bus (D3-D0) are replaced by one
of the four sets of the 4-bit multiway branch
addresses. The multiway branch set is selected
by bits D1 and D0 while bits D3 and D2 are
"don't cares."

IF NOT CC AND CNT ≠ 1 THEN
  CNT := CNT - 1
  GOTO TOS
ELSE CNT := CNT - 1
  POP STACK
  CONTINUE

If CC is LOW (pass) and the counter is not
equal to one, decrement the counter and
branch to the address on the top of the stack. If
CC is HIGH (fail) or the counter is equal to one,
then decrement the counter, pop the stack and
continue.






2Eh        RET

RETURN

Unconditional return from subroutine. The
return address is popped from the stack.

0Eh        RETCC

IF CC THEN RETURN
ELSE CONTINUE

If CC is HIGH (pass), return from subroutine.
The return address is popped from the stack. If
CC is LOW (fail), continue.

1Eh        RETNCC

IF NOT CC THEN RETURN
ELSE CONTINUE

If CC is LOW (pass), return from subroutine.
The return address is popped from the stack. If
CC is HIGH (fail), continue.




Note: Opcode numbers are in hexadecimal notation. 



Opcode
(I5-I0)    Mnemonics

31h        FOR_D
37h        FOR_A
33h        LOOP

Description



INITIALIZE LOOP

Push the Address Reg. + 1 on the stack, load
the counter from the D inputs and continue.
Use with DJUMP_S for FOR . . . NEXT loops.
The D port must be disabled to avoid bus
contention.

INITIALIZE LOOP

Push the Address Reg. + 1 on the stack, load
the counter from the A inputs and continue.
Use with DJUMP_S for FOR . . . NEXT loops.

INITIALIZE LOOP

Push the Address Reg. + 1 on the stack and
continue. Use with BRCC_S for
REPEAT . . . UNTIL loops, or with XTCC_D
and BRA_S for WHILE . . . END WHILE loops.
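To illustrate how the loop-initialize instruction pairs with the decrement-and-jump-to-TOS instruction, here is a small Python trace of a FOR . . . NEXT loop. The microprogram layout (initialization at address 0, loop-closing instruction as the last body word) and all names are assumptions for the example, and the model expects a count of at least one.

```python
def run_for_loop(count, body_len):
    """Count visits per address for a FOR-style loop.

    Assumed layout: loop initialization at address 0, body at addresses
    1..body_len, with the decrement-and-jump-to-TOS instruction as the
    last body word. Requires count >= 1 and body_len >= 1.
    """
    stack, counter, pc = [], 0, 0
    visits = [0] * (body_len + 2)
    while pc <= body_len:
        visits[pc] += 1
        if pc == 0:                    # initialize: push next address, load counter
            stack.append(pc + 1)
            counter = count
            pc += 1
        elif pc == body_len:           # decrement and jump to TOS
            if counter != 1:
                counter -= 1
                pc = stack[-1]         # loop back; stack is left untouched
            else:
                counter -= 1
                stack.pop()            # final pass: pop and fall through
                pc += 1
        else:
            pc += 1                    # ordinary body instruction
    return visits, stack, counter


visits, stack, counter = run_for_loop(count=3, body_len=2)
print(visits, stack, counter)
```

In this model a loaded count of N runs the body N times and leaves the stack balanced, which is the property that makes the pairing suitable for counted loops.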






34h        POP_D

Pop the stack and output the value on the D
outputs and continue. The D port must be
enabled.

38h        POP_C

Pop the stack and store the value in the
counter and continue.

35h        PUSH_D

Push the D inputs on the stack and continue.
The D port must be disabled to avoid bus
contention.

39h        PUSH_C

Push the counter on the stack and continue.

3Ah        SWAP

Exchange the counter and the top of stack and
continue.






Note: Opcode numbers are in hexadecimal notation.

Opcode
(I5-I0)    Mnemonics

3Bh        STACK_C

Push the counter on the stack and load the
counter with the value of the D inputs and
continue.

3Ch        LOAD_D

Load the counter with the value of the D inputs
and continue. The D port must be disabled to
avoid bus contention.

3Dh        LOAD_A

Load the counter with the value of the A inputs
and continue.






30h        CONT

Continue.

32h        DECR

Decrement the counter and continue.

36h        RESET_SP

Reset the stack pointer and continue.






3Eh

Load the comparison register with the value of
the D inputs, enable the comparator and
continue.

3Fh        CLEAR

Disable the comparator and continue.






Note: Opcode numbers are in hexadecimal notation.

APPLICATIONS

[Block diagram: interrupt vector and address sources feeding the Am29331 (D, A, and Test inputs; Y output), microprogram memory, pipeline register, clock, and the Am29332 ALU (A and B inputs; instruction, status register, and Y output).]

Figure 8. Typical Control-Path Architecture for Am29300 Family



[Timing diagram: ALU status register output / Am29331 test inputs; Am29331 outputs; microprogram memory outputs. The cycle is made up of, in order: clock to register status outputs of the Am29332; test inputs to Y outputs; microprogram memory access time; register setup time.]

Figure 9. Cycle Timing Waveform*

* This waveform shows the timing relationship for the configuration shown in Figure 8.
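The cycle-time relationship in Figure 9 amounts to summing the four delays around the control loop. A minimal sketch follows; the 25-ns test-to-Y figure is the Am29331 T11-0 to Y15-0 value from Table A, while the other three numbers are placeholders chosen only for illustration and must be taken from the actual ALU, memory, and register data sheets.

```python
def min_cycle_time_ns(clk_to_status, test_to_y, memory_access, register_setup):
    """Sum the four intervals that make up the control-path loop."""
    return clk_to_status + test_to_y + memory_access + register_setup


# Only test_to_y comes from this data sheet; the rest are hypothetical.
cycle = min_cycle_time_ns(clk_to_status=20, test_to_y=25,
                          memory_access=30, register_setup=5)
print(cycle, "ns")
```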
Suggestions for Power and Ground Pin
Connections

The Am29331 operates in an environment of fast signal rise
times and substantial switching currents. Therefore, care must
be exercised during circuit board design and layout, as with
any high-performance component. The following is a suggested
layout, but since systems vary widely in electrical
configuration, an empirical evaluation of the intended layout is
recommended.

The VCCT and GNDT pins, which carry output driver switching
currents, tend to be electrically noisy. The VCCE and GNDE
pins, which supply the ECL core of the device, tend to produce
less noise, and the circuits they supply may be adversely
affected by noise spikes on the VCCE plane. For this reason, it
is best to provide isolation between the VCCE and VCCT pins,
as well as independent decoupling for each. Isolating the
GNDE and GNDT pins is not required.

Printed Circuit-Board Layout Suggestions

1. Use of a multi-layer PC board with separate power, ground,
and signal planes is highly recommended.

2. All VCCE and VCCT pins should be connected to the VCC
plane. VCCT pins should be isolated from VCCE pins by means
of a slot cut in the VCC plane; see Figure 10. By physically
separating the VCCE and VCCT pins, coupled noise will be
reduced.

3. All GNDE and GNDT pins should be connected directly to
the ground plane.

4. The VCCT pins should be decoupled to ground with a 0.1-µF
ceramic capacitor and a 10-µF electrolytic capacitor, placed
as closely to the Am29331 as is practical. VCCE pins should
be decoupled to ground in a similar manner.

A suggested layout is shown in Figure 10.



[Board layout drawing: pin-grid columns A through N, showing the isolation cut.]

• = Through Hole
® = VCC Plane Connection
C1 = C3 = 10 µF
C2 = C4 = C6 = 0.1 µF

Figure 10. Suggested Printed Circuit-Board Layout



Parameter           °C/W

θJA Still Air       21.8
θJA 200 LFM         7.7
θJA 600 LFM         5.1
θJC Heat Sink

[Plot: thermal resistance versus air velocity, 200 to 600 linear feet per minute.]

Figure 11. Am29331 Thermal Characteristics (Typical)
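As a rough illustration of how the Figure 11 values might be used, the sketch below estimates temperature rise from power dissipation. It assumes the tabulated parameter is junction-to-ambient thermal resistance, takes worst-case power as VCC max times ICC max (5.25 V and 1,300 mA from the operating ranges and DC characteristics), and uses a hypothetical 25°C ambient. Treat the result as an estimate only.

```python
def junction_temp_c(ambient_c, power_w, theta_ja_c_per_w):
    """Steady-state estimate: T = T_ambient + P * theta."""
    return ambient_c + power_w * theta_ja_c_per_w


power_w = 5.25 * 1.3   # assumed worst case: VCC max times ICC max
for label, theta in (("still air", 21.8), ("200 LFM", 7.7), ("600 LFM", 5.1)):
    print(label, round(junction_temp_c(25.0, power_w, theta), 1), "deg C")
```

The steep drop between still air and 200 LFM in this estimate is consistent with the recommendation of at least 200 linear feet per minute of airflow.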
ABSOLUTE MAXIMUM RATINGS

Temperature Under Bias - TC ............ -55 to +125°C
Supply Voltage to Ground Potential
  Continuous ........................... -0.5 to +7.0 V
DC Voltage Applied to Outputs
  for High State ....................... -0.5 V to +VCC Max

Stresses above those listed under ABSOLUTE MAXIMUM
RATINGS may cause permanent device failure. Functionality
at or above these limits is not implied. Exposure to absolute
maximum ratings for extended periods may affect device
reliability.

OPERATING RANGES

Commercial (C) Devices
  Temperature (TC) ..................... 0 to +85°C
  Supply Voltage (VCC) ................. +4.75 to +5.25 V
  Air Velocity ......................... 200 linear feet per minute

Operating ranges define those limits between which the
functionality of the device is guaranteed.

DC CHARACTERISTICS over operating range

Parameters  Description             Test Conditions (Note 1)                           Min.    Max.    Unit

VOH         Output HIGH Voltage     VCC = Min., VIN = VIL or VIH
                                      IOH = -1.6 mA for Y0-Y15, INTA                   2.4             Volts
                                      IOH = -1.2 mA for All Others
VOL         Output LOW Voltage      VCC = Min., VIN = VIL or VIH
                                      IOL = 16 mA for Y0-Y15, INTA                             0.5     Volts
                                      IOL = 12 mA for All Others
VIH         Input HIGH Level        Guaranteed Input Logical
                                    HIGH Voltage for All Inputs                        2.0             Volts
VIL         Input LOW Level         Guaranteed Input Logical
                                    LOW Voltage for All Inputs                                 0.8     Volts
VI          Input Clamp Voltage     VCC = Min., IIN = -18 mA                                   -1.5    Volts
IIL         Input LOW Current       VCC = Max., VIN = 0.5 V
                                      Y0-Y15, D0-D15, INTA, A-FULL, EQUAL                      -0.55   mA
                                      A0-A15, M0-3, 0-3, I0-I5, T0-T11,
                                      S0-S3, FC, CIN                                           -0.50
                                      OED                                                      -1.0
                                      SLAVE, HOLD                                              -1.5
                                      CP, INTR, INTEN                                          -2.5
                                      RST                                                      -3.0
IIH         Input HIGH Current      VCC = Max., VIN = 2.4 V
                                      Y0-Y15, D0-D15, INTA, A-FULL, EQUAL                      100     µA
                                      A0-A15, M0-3, 0-3, I0-I5, T0-T11,
                                      S0-S3, FC, CIN                                           50
                                      OED                                                      100
                                      SLAVE, HOLD                                              150
                                      CP, INTR, INTEN                                          250
                                      RST                                                      300
II          Input HIGH Current      VCC = Max., VIN = 5.5 V                                    1.0     mA
IOZH        Off State               VCC = Max.
IOZL        (High-Impedance)          VO = 2.4 V                                               100     µA
            Output Current            VO = 0.5 V                                               -550
ISC         Output Short Circuit    VCC = Max. + 0.5 V,
            Current (Note 2)        VOUT = +0.5 V                                      -15     -65     mA
ICC         Power Supply Current    VCC = Max., COM'L Only
            (Note 3)                  TC = 0 to +85°C                                          1,300   mA
                                      TC = +85°C                                               1,200

Notes: 1. For conditions shown as Min. or Max., use the appropriate value specified under Operating Ranges for the applicable device type.

2. Not more than one output should be shorted at a time. Duration of the short-circuit test should not exceed one second.

3. Measured with all inputs LOW and outputs disabled.

4. It is the responsibility of the user to maintain a case temperature of +85°C or less. AMD recommends an air velocity of at least 200 linear
feet per minute over the heatsink.
SWITCHING CHARACTERISTICS over operating range (Note 1)

A. COMBINATIONAL PROPAGATION DELAYS

                                  29331         29331A
No.   From       To               Max. Delay    Max. Delay    Unit

1     D15-0      Y15-0            19            17            ns
      D15-0      EQUAL            23            20            ns
      D15-0      ERROR            25            22            ns
2     A15-0      Y15-0            19            17            ns
      A15-0      EQUAL            23            20            ns
      A15-0      ERROR            25            22            ns
3     MX3-X0     Y15-0            19            17            ns
      MX3-X0     EQUAL            23            20            ns
      MX3-X0     ERROR            25            25            ns
      Y15-0      EQUAL            20            17            ns
      Y15-0      ERROR            21                          ns
4     I5-0       Y15-0            25            22            ns
5     I5-0       D15-0            31                          ns
      I5-0       EQUAL            29                          ns
      I5-0       ERROR            29            25            ns
6     T11-0      Y15-0            25                          ns
      T11-0      EQUAL            30            26            ns
      T11-0      ERROR            30                          ns
      S3-0       Y15-0            25                          ns
      S3-0       EQUAL            30            26            ns
      S3-0       ERROR            30                          ns
7     CP         Y15-0            20                          ns
8     CP         D15-0            20/Z                        ns
9     CP         A-FULL           18                          ns
      CP         EQUAL            25                          ns
      CP         ERROR            30                          ns
10    RST        Y15-0            26/Z                        ns
      RST        D15-0            Z                           ns
11    RST        INTA             12                          ns
      RST        EQUAL            27                          ns
      RST        ERROR            29                          ns
12    FC         Y15-0            21                          ns
13    FC         D15-0            23                          ns
      FC         EQUAL            26                          ns
      FC         ERROR            26                          ns
      INTR       Y15-0            Z             Z             ns
14    INTR       INTA             11            11            ns
      INTR       EQUAL            (Note 2)      (Note 2)      ns
      INTR       ERROR            22                          ns
      INTEN      Y15-0            Z                           ns
15    INTEN      INTA             11            11            ns
      INTEN      EQUAL            (Note 2)      (Note 2)      ns
      INTEN      ERROR            22                          ns
      HOLD       Y15-0            Z                           ns
      HOLD       INTA             Z                           ns
      HOLD       A-FULL           Z             Z             ns
      HOLD       EQUAL            21/Z                        ns
      HOLD       ERROR            19            17            ns
      OED        D15-0            Z             Z             ns
      OED        ERROR            19            17            ns
      INTA       ERROR            19                          ns
      A-FULL     ERROR            19                          ns
      EQUAL      ERROR            19                          ns
16    CIN        Y15-0            20                          ns
      CIN        EQUAL            25                          ns
      CIN        ERROR            26                          ns
      SLAVE      Y15-0            Z             Z             ns
      SLAVE      D15-0            Z             Z             ns
      SLAVE      INTA             Z             Z             ns
      SLAVE      A-FULL           Z             Z             ns
      SLAVE      EQUAL            Z             Z             ns

Notes: See notes following Table C.
SWITCHING CHARACTERISTICS (Cont'd.)

B. OUTPUT DISABLE TIME

                                                          29331         29331A
No.   From     To        Description                      Max. Value    Max. Value    Unit

      RST      Y15-0     Reset-to-Address Enable          25            25            ns
      RST      Y15-0     Reset-to-Address Disable         25            25            ns
43    INTR     Y15-0     INTR-to-Address Enable           25            25            ns
44    INTR     Y15-0     INTR-to-Address Disable          25            25            ns
      INTEN    Y15-0     INTEN-to-Address Enable          25            25            ns
      INTEN    Y15-0     INTEN-to-Address Disable         25            25            ns
      HOLD     Y15-0     HOLD-to-Address Enable           25            25            ns
      HOLD     Y15-0     HOLD-to-Address Disable          25            25            ns
      SLAVE    Y15-0     SLAVE-to-Address Enable          25            25            ns
      SLAVE    Y15-0     SLAVE-to-Address Disable         25            25            ns
      OED      D15-0     OED-to-Data Enable               25            25            ns
      OED      D15-0     OED-to-Data Disable              25            25            ns
      RST      D15-0     Reset-to-Data Enable             25            25            ns
      RST      D15-0     Reset-to-Data Disable            25            25            ns
      SLAVE    D15-0     SLAVE-to-Data Enable             25            25            ns
      SLAVE    D15-0     SLAVE-to-Data Disable            25            25            ns
      CP       D15-0     Clock-to-Data Enable             30            30            ns
      CP       D15-0     Clock-to-Data Disable            30            30            ns
      HOLD     INTA      HOLD-to-INTA Enable              25            25            ns
      HOLD     INTA      HOLD-to-INTA Disable             25            25            ns
      HOLD     A-FULL    HOLD-to-A-FULL Enable            25            25            ns
      HOLD     A-FULL    HOLD-to-A-FULL Disable           25            25            ns
      HOLD     EQUAL     HOLD-to-EQUAL Enable             25            25            ns
      HOLD     EQUAL     HOLD-to-EQUAL Disable            25            25            ns
      SLAVE    INTA      SLAVE-to-INTA Enable             25            25            ns
      SLAVE    INTA      SLAVE-to-INTA Disable            25            25            ns
      SLAVE    A-FULL    SLAVE-to-A-FULL Enable           25            25            ns
      SLAVE    A-FULL    SLAVE-to-A-FULL Disable          25            25            ns
      SLAVE    EQUAL     SLAVE-to-EQUAL Enable            25            25            ns
      SLAVE    EQUAL     SLAVE-to-EQUAL Disable           25            25            ns

Notes: See notes following Table C.
SWITCHING CHARACTERISTICS (Cont'd.)

C. SETUP AND HOLD TIMES

                                             With          29331         29331A
No.   Parameter                   For        Respect To    Max. Value    Max. Value    Unit

17    Data Setup                  D15-0      CP ↑          8             8             ns
18    Data Hold                   D15-0      CP ↑          4             4             ns
19    Alternate Data Setup        A15-0      CP ↑          8             8             ns
20    Alternate Data Hold         A15-0      CP ↑          3                           ns
21    Multiway Setup              MX3-X0     CP ↑          8                           ns
22    Multiway Hold               MX3-X0     CP ↑          2             2             ns
23    Address Setup               Y15-0      CP ↑          5                           ns
24    Address Hold                Y15-0      CP ↑          3             3             ns
25    Instruction Setup           I5-0       CP ↑          11            11            ns
26    Instruction Hold            I5-0       CP ↑          1                           ns
27    Forced Continue Setup       FC         CP ↑          11                          ns
28    Forced Continue Hold        FC         CP ↑          0             0             ns
29    Test Setup                  T11-0      CP ↑          16                          ns
30    Test Hold                   T11-0      CP ↑          0             0             ns
31    Select Setup                S3-0       CP ↑          16            16            ns
32    Select Hold                 S3-0       CP ↑          0             0             ns
33    Reset Setup                 RST        CP ↑          15                          ns
34    Reset Hold                  RST        CP ↑          2                           ns
35    Interrupt Request Setup     INTR       CP ↑          8                           ns
36    Interrupt Request Hold      INTR       CP ↑          2             2             ns
37    Interrupt Enable Setup      INTEN      CP ↑          8                           ns
38    Interrupt Enable Hold       INTEN      CP ↑          2                           ns
39    Hold Mode Setup             HOLD       CP ↑          5                           ns
40    Hold Mode Hold              HOLD       CP ↑          3             3             ns
41    Carry-In Setup              CIN        CP ↑          10                          ns
42    Carry-In Hold               CIN        CP ↑                                      ns

Notes: 1. It is the responsibility of the user to maintain a case temperature of +85°C or less. AMD recommends
an air velocity of at least 200 linear feet per minute over the heatsink.

2. (INTR, INTEN)-to-EQUAL is the sum of (INTR, INTEN)-to-Y disable time and Y-to-EQUAL delay time.
This is not tested due to bus turnaround in Master/Slave mode.

3. The status of I5-I0 and FC must not be changed during the Clock LOW time.

4. CL = 50 pF; CL = 5 pF for Disable Time only.

5. Z = Three-state output path; use Table B.
SWITCHING TEST CIRCUIT

[Circuit diagram: load network on VOUT with R1 = 240 Ω and switches S1, S2, S3.]

A. Three-State Outputs

Notes: 1. CL = 50 pF includes scope probe, wiring, and stray capacitances without device in test fixture.

2. S1, S2, S3 are closed during function tests and all AC tests except output enable tests.

3. S1 and S3 are closed while S2 is open for tPZH test.
S1 and S2 are closed while S3 is open for tPZL test.

4. CL = 5.0 pF for output disable tests.

SWITCHING TEST WAVEFORMS

[Waveform: data input and timing input.]

Notes: 1. Diagram shown for HIGH data only. Output
transition may be opposite sense.
2. Cross-hatched area is don't care condition.

Setup, Hold, and Release Times

[Waveform: LOW-HIGH-LOW pulse and HIGH-LOW-HIGH pulse, pulse width tPW.]

Pulse Width

[Waveform: input swinging between 0 V and 3 V with a 1.5 V reference; output between VOL and VOH.]

Propagation Delay

[Waveform: output normally LOW and output normally HIGH, with 0.5 V levels marked relative to VOL and VOH.]

Notes: 1. Diagram shown for Input Control Enable-LOW
and Input Control Disable-HIGH.
2. S1, S2, and S3 of Load Circuit are closed
except where shown.

Enable and Disable Times
Notes on Test Methods

The following points give the general philosophy which we
apply to tests which must be properly engineered if they are to
be implemented in an automatic environment. The specifics of
what philosophies applied to which test are shown.

1. Ensure the part is adequately decoupled at the test head.
Large changes in supply current when the device switches
may cause function failures due to VCC changes.

2. Do not leave inputs floating during any tests, as they may
oscillate at high frequency.

3. Do not attempt to perform threshold tests at high speed.
Following an input transition, ground current may change by
as much as 400 mA in 5 - 8 ns. Inductance in the ground
cable may allow the ground pin at the device to rise by
hundreds of millivolts momentarily.

4. Use extreme care in defining input levels for AC tests. Many
inputs may be changed at once, so there will be significant
noise at the device pins which may not actually reach VIL or
VIH until the noise has settled. AMD recommends using
VIL ≤ 0 V and VIH ≥ 3 V for AC tests.

5. To simplify failure analysis, programs should be designed to
perform DC, Function, and AC tests as three distinct groups
of tests.

6. Capacitive Loading for AC Testing

Automatic testers and their associated hardware have stray
capacitance which varies from one type of tester to
another, but is generally around 50 pF. This makes it
impossible to make direct measurements of parameters
which call for a smaller capacitive load than the associated
stray capacitance. Typical examples of this are the so-called
"float delays" which measure the propagation
delays into and out of the high-impedance state, and are
usually specified at a load capacitance of 5.0 pF. In these
cases, the test is performed at the higher load capacitance
(typically 50 pF), and engineering correlations based on
data taken with a bench setup are used to predict the result
at the lower capacitance.

Similarly, a product may be specified at more than one
capacitive load. Since the typical automatic tester is not
capable of switching loads in mid-test, it is impossible to
make measurements at both capacitances even though
they may both be greater than the stray capacitance. In
these cases, a measurement is made at one of the two
capacitances. The result at the other capacitance is
predicted from engineering correlations based on data
taken with a bench setup and the knowledge that certain
DC measurements (IOH, IOL, for example) have already
been taken and are within specification. In some cases,
special DC tests are performed in order to facilitate this
correlation.

7. Threshold Testing

The noise associated with automatic testing, the long
inductive cables, and the high gain of bipolar devices when
in the vicinity of the actual device threshold frequently give
rise to oscillations when testing high-speed circuits. These
oscillations are not indicative of a reject device, but instead,
of an overtaxed test system. To minimize this problem,
thresholds are tested at least once for each input pin.
Thereafter, "hard" high and low levels are used for other
tests. Generally this means that function and AC testing are
performed at "hard" input levels rather than at VIL max.
and VIH min.

8. AC Testing

Occasionally parameters are specified which cannot be
measured directly on automatic testers because of tester
limitations. Data input hold times often fall into this category.
In these cases, the parameter in question is guaranteed
by correlating these tests with other AC tests which have
been performed. These correlations are arrived at by the
cognizant engineer by using data from precise bench
measurements in conjunction with the knowledge that
certain DC parameters have already been measured and
are within specification.

In some cases, certain AC tests are redundant since they
can be shown to be predicted by other tests which have
already been performed. In these cases, the redundant
tests are not performed.
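The ground-rise warning in point 3 follows from V = L × di/dt. The short calculation below is a sketch only: the 400 mA and 5 - 8 ns figures come from the text, while the 5-nH ground-path inductance is a hypothetical value picked for illustration and the current change is treated as a linear ramp.

```python
def ground_bounce_v(inductance_h, delta_i_a, delta_t_s):
    """V = L * di/dt for a linear current ramp."""
    return inductance_h * delta_i_a / delta_t_s


L_GROUND = 5e-9   # hypothetical 5 nH ground-path inductance
for dt in (5e-9, 8e-9):
    millivolts = ground_bounce_v(L_GROUND, 0.4, dt) * 1e3
    print(f"{dt * 1e9:.0f} ns edge: {millivolts:.0f} mV")
```

Even a few nanohenries therefore produce a rise of the same order as the margin between the recommended "hard" test levels and the specified input thresholds, which is why threshold tests at speed are discouraged.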



SWITCHING WAVEFORMS
KEY TO SWITCHING WAVEFORMS

[Legend of waveform symbols:]
  DON'T CARE; ANY CHANGE PERMITTED
  WILL BE CHANGING FROM H TO L
  WILL BE CHANGING FROM L TO H
  CHANGING; STATE UNKNOWN
  CENTER LINE IS HIGH IMPEDANCE "OFF" STATE

SWITCHING WAVEFORMS (Cont'd.)

[Waveforms: input-to-output delay and clock-to-output delay.]

[Timing diagram over Cycle 1 and Cycle 2: CP, HOLD, RESET, INTEN, INTR, INTA, Y, INT-VECT buffer, address register, interrupt return address register.]

Interrupt Timing

Notes: 1. Interrupt Request comes from an interrupt-controller register. It reflects the CP ↑ to INTR time of
the interrupt controller.

2. During Cycle 2, there may be contention on the Y-bus if the Y-bus is turned ON before the INT-
VECT buffer is turned OFF.

3. Refer to Figures 4 and 5 for definition of A and B.

Reset Timing

[Timing diagrams: CP, Y, INTA, T, S, A-FULL.]

INPUT/OUTPUT INTERFACE CONDITIONS
(All Devices)

[Circuit diagram: driving output and driven input.]



Am29332

32-Bit Arithmetic Logic Unit

DISTINCTIVE CHARACTERISTICS

Single Chip, 32-Bit ALU
Supports 80-90 ns microcycle time for the 32-bit
data path. It is a combinatorial ALU with equal
cycle time for all instructions.

Flow-through Architecture
A combinatorial ALU with two input data ports and
one output data port allows implementation of either
parallel or pipelined architectures.

64-Bit In, 32-Bit Out Funnel Shifter
This unique functional block allows n-bit shift-up,
shift-down, 32-bit barrel shift or 32-bit field extract.

Supports All Data Types
It supports one-, two-, three- and four-byte data for
all operations and variable-length fields for logical
operations.

Multiply and Divide Support
Built-in hardware to support two-bit-at-a-time modified
Booth's algorithm and one-bit-at-a-time division
algorithm.

Extensive Error Checking
Parity check and generate provides data transmission
check and master/slave mode provides complete
function checking.
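The funnel shifter listed above can be pictured as a window of 32 bits taken from a 64-bit concatenation of two words. The Python sketch below shows that general technique; how the Am29332 actually encodes the shift position and which port supplies each half are not stated here, so the function names and argument order are assumptions.

```python
MASK32 = 0xFFFFFFFF


def funnel_shift(high, low, position):
    """Return the 32 bits of the 64-bit value high:low that start
    `position` bits above bit 0."""
    if not 0 <= position <= 32:
        raise ValueError("position must be in 0..32")
    combined = ((high & MASK32) << 32) | (low & MASK32)
    return (combined >> position) & MASK32


def rotate_right32(value, n):
    """Barrel rotate: feed the same word to both halves of the funnel."""
    return funnel_shift(value, value, n % 32)


def shift_down32(value, n):
    """Logical shift down: zeros in the upper half."""
    return funnel_shift(0, value, n)
```

Shift-up, shift-down, rotation, and field extraction then differ only in what is presented on the two 32-bit inputs and where the window is placed.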



GENERAL DESCRIPTION

The Am29332 is a 32-bit wide non-cascadable Arithmetic
Logic Unit (ALU) with integration of functions that normally
don't cascade, such as barrel shifters, priority encoders
and mask generators. Two input data ports and one output
data port provide flow-through architecture and allow the
designer to implement his/her architecture with any degree
of pipelining and no built-in penalties for branching. Also,
the simplicity of a three-bus ALU allows easy implementation
of parallel or reconfigurable architectures. The register
file is off-chip to allow unlimited expansion and regular
addressability.

The Am29332 supports one-, two-, three- and four-byte
data for arithmetic and logic operations. It also supports
multiprecision arithmetic and shift operations. For logical
operations, it can support variable-length fields up to 32
bits. When fewer than four bytes are selected, unselected
bits are passed to the destination without modification. The
device also supports two-bit-at-a-time modified Booth's
algorithm for high-speed multiplication and one-bit-at-a-time
division. Both signed and unsigned integers for all byte
aligned data types mentioned above are supported.

The Am29332 is designed to support 80-90 ns microcycle
time. The device is packaged in a 169-lead pin-grid-array
package.



SIMPLIFIED BLOCK DIAGRAM 

[Block diagram not reproduced.]

Publication # 05730   Rev. E   Amendment /0
Issue Date: July 1987



RELATED AMD PRODUCTS

Part No.     Description
Am29C01      CMOS 4-Bit Microprocessor Slice
Am29C10A     CMOS 12-Bit Sequencer
Am29C101     CMOS 16-Bit Microprocessor
Am29112      8-Bit Cascadable Microprogram Sequencer
Am29114      Real-Time Interrupt Controller
Am29C116     CMOS 16-Bit Microcontroller
Am29C323     CMOS 32x32 Parallel Multiplier
Am29325      32-Bit Floating Point Processor
Am29C325     CMOS 32-Bit Floating Point Processor
Am29331      16-Bit Microprogram Sequencer
Am29C331     CMOS 16-Bit Microprogram Sequencer
Am29334      64x18 Four-Port, Dual-Access Register File
Am29C334     CMOS 64x18 Four-Port, Dual-Access Register File
Am29337      16-Bit Bounds Checker
Am29338      32-Bit Byte Queue
Am29C516     CMOS 16x16 Multiplier
Am29C517     CMOS 16x16 Multiplier with Separate I/O



CONNECTION DIAGRAM 
169-Lead PGA
Bottom View

[Connection diagram not reproduced; see the pin designation tables.]

* This pin is not used

Key: VCCE = VCC, ECL
     VCCT = VCC, TTL
     GNDE = GND, ECL
     GNDT = GND, TTL
PIN DESIGNATIONS

(Sorted by Pin No.)

PIN   PIN NAME  PAD   PIN   PIN NAME  PAD   PIN   PIN NAME  PAD   PIN   PIN NAME  PAD
NO.             NO.   NO.             NO.   NO.             NO.   NO.             NO.

A-1   DB6       1     C-9   W3        145   J-15  GND, TTL  105   R-10  Y31       66
A-2   DA5       164   C-10  I0        139   J-16  Y5        101   R-11  GND, ECL  64
A-3   DB4       161   C-11  GND, ECL  143   J-17  Y4        102   R-12  Vcc, TTL  71
A-4   DB2       157   C-12  I5        134   K-1   DB16      27    R-13  Y25       74
A-5   DB1       155   C-13  CP        130   K-2   PA1       25    R-14  GND, TTL  79
A-6   DB0       153   C-14  SLAVE     127   K-3   DA15      24    R-15  Y19       82
A-7   P1        148   C-15  N         120   K-15  Y7        99    R-16  Y15       88
A-8   P2        149   C-16  L         118   K-16  Y6        100   R-17  Y14       89
A-9   W2        142   C-17  GND, TTL  117   K-17  GND, TTL  98    T-1   DA23      42
A-10  I2        137   D-1   DB8       7     L-1   PB1       26    T-2   DB23      41
A-11  I3        136   D-2   PB0       6     L-2   DA16      28    T-3   DA24      46
A-12  I6        133   D-3   PA0       5     L-3   Vcc, ECL  22    T-4   DA25      48
A-13  I8        131   D-15  C         119   L-15  Vcc, ECL  103   T-5   DA27      52
A-14  MLINK     129   D-16  Vcc, TTL  116   L-16  Vcc, ECL  103   T-6   DA28      54
A-15  M/m       125   D-17  PY0       115   L-17  Vcc, ECL  103   T-7   DA30      58
A-16  BOROW     124   E-1   DB9       9     M-1   DB18      31    T-8   DA31      60
A-17  HOLD      123   E-2   DA9       10    M-2   DA17      30    T-9   PA3       61
B-1   DA6       2     E-3   DA8       8     M-3   DB17      29    T-10  Y30       67
B-2   DB5       163   E-15  PY3       112   M-15  Y8        96    T-11  Y27       70
B-3   DA3       160   E-16  PY1       114   M-16  Y11       93    T-12  GND, TTL  72
B-4   DA2       158   E-17  Y0        109   M-17  Y9        95    T-13  Y23       76
B-5   DA1       156   F-1   DB10      11    N-1   DB19      33    T-14  Vcc, TTL  78
B-6   P5        152   F-2   DB11      13    N-2   DA19      34    T-15  Y21       80
B-7   P3        150   F-3   DA10      12    N-3   DA18      32    T-16  Y18       83
B-8   P0        147   F-15  PY2       113   N-15  Y12       92    T-17  Y16       86
B-9   W1        141   F-16  GND, TTL  110   N-16  Y10       94    U-1   PA2       43
B-10  W0        140   F-17  PERR      111   N-17  Vcc, TTL  97    U-2   PB2       44
B-11  I1        138   G-1   DA11      14    P-1   DB20      35    U-3   DB24      45
B-12  I4        135   G-2   DA12      16    P-2   DA20      36    U-4   DB26      49
B-13  I7        132   G-3   GND, ECL  21    P-3   DB21      37    U-5   DB27      51
B-14  RS        128   G-15  GND, ECL  104   P-15  OE-Y      87    U-6   DB29      55
B-15  MCin      126   G-16  GND, ECL  104   P-16  Y13       90    U-7   DA29      56
B-16  V         121   G-17  GND, ECL  104   P-17  GND, TTL  91    U-8   DB30      57
B-17  Z         122   H-1   DB12      15    R-1   DB22      39    U-9   PB3       62
C-1   DB7       3     H-2   DA13      18    R-2   DA21      38    U-10  Y28       69
C-2   DA7       4     H-3   DB13      17    R-3   DA22      40    U-11  Y29       68
C-3   DA4       162   H-15  Y3        106   R-4   DB25      47    U-12  Y26       73
C-4   DB3       159   H-16  Y2        107   R-5   DA26      50    U-13  Y24       75
C-5   DA0       154   H-17  Y1        108   R-6   DB28      53    U-14  Y22       77
C-6   P4        151   J-1   DA14      20    R-7   Vcc, ECL  63    U-15  Y20       81
C-7   Vcc, ECL  144   J-2   DB14      19    R-8   DB31      59    U-16  Y17       84
C-8   W4        146   J-3   DB15      23    R-9   MSERR     65    U-17  GND, TTL  85
PIN DESIGNATIONS
(Sorted by Pin Names)

PIN NAME  PIN   PAD   PIN NAME  PIN   PAD   PIN NAME  PIN   PAD   PIN NAME  PIN   PAD
          NO.   NO.             NO.   NO.             NO.   NO.             NO.   NO.

BOROW     A-16  124   DB7       C-1   3     I2        A-10  137   Vcc, TTL  T-14  78
C         D-15  119   DB8       D-1   7     I3        A-11  136   Vcc, TTL  N-17  97
CP        C-13  130   DB9       E-1   9     I4        B-12  135   Vcc, TTL  D-16  116
DA0       C-5   154   DB10      F-1   11    I5        C-12  134   Vcc, TTL  R-12  71
DA1       B-5   156   DB11      F-2   13    I6        A-12  133   W0        B-10  140
DA2       B-4   158   DB12      H-1   15    I7        B-13  132   W1        B-9   141
DA3       B-3   160   DB13      H-3   17    I8        A-13  131   W2        A-9   142
DA4       C-3   162   DB14      J-2   19    L         C-16  118   W3        C-9   145
DA5       A-2   164   DB15      J-3   23    MCin      B-15  126   W4        C-8   146
DA6       B-1   2     DB16      K-1   27    MLINK     A-14  129   Y0        E-17  109
DA7       C-2   4     DB17      M-3   29    M/m       A-15  125   Y1        H-17  108
DA8       E-3   8     DB18      M-1   31    MSERR     R-9   65    Y2        H-16  107
DA9       E-2   10    DB19      N-1   33    N         C-15  120   Y3        H-15  106
DA10      F-3   12    DB20      P-1   35    OE-Y      P-15  87    Y4        J-17  102
DA11      G-1   14    DB21      P-3   37    P0        B-8   147   Y5        J-16  101
DA12      G-2   16    DB22      R-1   39    P1        A-7   148   Y6        K-16  100
DA13      H-2   18    DB23      T-2   41    P2        A-8   149   Y7        K-15  99
DA14      J-1   20    DB24      U-3   45    P3        B-7   150   Y8        M-15  96
DA15      K-3   24    DB25      R-4   47    P4        C-6   151   Y9        M-17  95
DA16      L-2   28    DB26      U-4   49    P5        B-6   152   Y10       N-16  94
DA17      M-2   30    DB27      U-5   51    PA0       D-3   5     Y11       M-16  93
DA18      N-3   32    DB28      R-6   53    PA1       K-2   25    Y12       N-15  92
DA19      N-2   34    DB29      U-6   55    PA2       U-1   43    Y13       P-16  90
DA20      P-2   36    DB30      U-8   57    PA3       T-9   61    Y14       R-17  89
DA21      R-2   38    DB31      R-8   59    PB0       D-2   6     Y15       R-16  88
DA22      R-3   40    GND, ECL  G-3   21    PB1       L-1   26    Y16       T-17  86
DA23      T-1   42    GND, ECL  R-11  64    PB2       U-2   44    Y17       U-16  84
DA24      T-3   46    GND, ECL  G-17  104   PB3       U-9   62    Y18       T-16  83
DA25      T-4   48    GND, ECL  G-15  104   PERR      F-17  111   Y19       R-15  82
DA26      R-5   50    GND, ECL  G-16  104   PY0       D-17  115   Y20       U-15  81
DA27      T-5   52    GND, ECL  C-11  143   PY1       E-16  114   Y21       T-15  80
DA28      T-6   54    GND, TTL  T-12  72    PY2       F-15  113   Y22       U-14  77
DA29      U-7   56    GND, TTL  R-14  79    PY3       E-15  112   Y23       T-13  76
DA30      T-7   58    GND, TTL  U-17  85    RS        B-14  128   Y24       U-13  75
DA31      T-8   60    GND, TTL  P-17  91    SLAVE     C-14  127   Y25       R-13  74
DB0       A-6   153   GND, TTL  K-17  98    V         B-16  121   Y26       U-12  73
DB1       A-5   155   GND, TTL  J-15  105   Vcc, ECL  R-7   63    Y27       T-11  70
DB2       A-4   157   GND, TTL  F-16  110   Vcc, ECL  L-16  103   Y28       U-10  69
DB3       C-4   159   GND, TTL  C-17  117   Vcc, ECL  L-15  103   Y29       U-11  68
DB4       A-3   161   HOLD      A-17  123   Vcc, ECL  L-17  103   Y30       T-10  67
DB5       B-2   163   I0        C-10  139   Vcc, ECL  C-7   144   Y31       R-10  66
DB6       A-1   1     I1        B-11  138   Vcc, ECL  L-3   22    Z         B-17  122
LOGIC SYMBOL

[Logic symbol not reproduced. Signals legible in the figure include
DA0-DA31, PA0-PA3, PB0-PB3, DB0-DB31, SLAVE, PERR, BOROW, MCin,
MLINK, CP, HOLD and RS.]



METALLIZATION AND PAD LAYOUT 







Die size: 367x387 mils 
Gate Count: 5200 
ORDERING INFORMATION
Standard Products

AMD standard products are available in several packages and operating ranges. The order number (Valid
Combination) is formed by a combination of:

a. Device Number
b. Speed Option (if applicable)
c. Package Type
d. Temperature Range
e. Optional Processing

a. DEVICE NUMBER/DESCRIPTION
   Am29332/Am29332A
   32-Bit Arithmetic Logic Unit

b. SPEED OPTION
   Not Applicable

c. PACKAGE TYPE
   G = 169-Lead Pin Grid Array with Heatsink (CG 169)

d. TEMPERATURE RANGE
   C = Commercial (0 to +85°C)

e. OPTIONAL PROCESSING
   Blank = Standard processing
   B = Burn-in

Valid Combinations

AM29332       GC, GCB
AM29332A


Valid Combinations 

Valid Combinations list configurations planned to be 
supported in volume for this device. Consult the local AMD 
sales office to confirm availability of specific valid 
combinations, to check on newly released combinations, and 
to obtain additional data on AMD's standard military grade 
products. 
PIN DESCRIPTION

BOROW Borrow (Input)

When HIGH, the Carry In and Carry Out are borrows for
subtract operations.

C, Z, N, V, L Status (Input/Output)

When the Register Status pin is LOW, these pins give the
Carry, Zero, Negative, Overflow and Link outputs of the ALU
where applicable to the instruction being executed. When
not applicable to the instruction being executed, or when the
Register Status pin is HIGH, these pins give the outputs of
the Carry, Zero, Negative, Overflow and Link bits of the
internal Status Register. In Slave mode, C, Z, N, V and L
become inputs.

CP Clock Input (Input)

Clocks internal registers (status, Q) at the LOW to HIGH
transition, provided the HOLD input is LOW.

DA0-DA31 Data Input for DA-bus (Input)

Data input lines for operand A.

DB0-DB31 Data Input for DB-bus (Input)

Data input lines for operand B.

HOLD Hold (Input, Active HIGH)

When HIGH, it inhibits the update of the status and Q
registers.

I0-I6 Instruction Inputs (Input)

Used to select the operation to be performed.

I7-I8 Byte Width Inputs (Input)

Byte width inputs for byte boundary aligned operand
instructions. For variable field bit operands, these inputs
select the sources for the width and position inputs. If I7 is
LOW it selects the width input from pins W4-W0. If I7 is HIGH
the width input is selected from the internal width register.
Similarly, if I8 is LOW it selects the position inputs from pins
P5-P0 and if HIGH it selects input from the internal position
register.

MCin Macro Status Carry (Input)

External carry input.

MLINK Macro Status Link (Input)

External link input.

M/m Macro/Micro Select (Input)

When HIGH, selects macro carry and macro link pins as
input instead of micro carry and micro link from the
micro-status register.

MSERR Master-Slave Error (Output)

When HIGH, this signal indicates that the master's and
slave's data were not identical.

OE-Y Output Enable (Input, Active LOW)

When OE-Y is HIGH the Y-bus is disabled (three-stated).

P0-P5 Position Inputs (Input)

Position input to select the position of the least significant bit
of a field. Also indicates the amount by which data is to be
shifted up (P5 = LOW) or down (P5 = HIGH) or rotated.

PA0-PA3 Parity Input for DA-bus (Input)

Parity input for operand A on DA-bus (one per byte).
Even parity is used for the Am29332.

PB0-PB3 Parity Input for DB-bus (Input)

Parity input for operand B on DB-bus (one per byte).

PERR Parity Error (Input/Output)

When HIGH, indicates that a parity error was detected on
the DA or DB inputs.

PY0-PY3 Parity for Y-bus (Input/Output)

Parity output for data on Y-bus (one per byte). Even parity is
used for the Am29332. In Slave mode, PY0-PY3 become
inputs.

RS Register Status Mode Pin (Input)

Selects between ALU status (Register Status = LOW) or
register status (Register Status = HIGH) on the C, Z, N, V
and L outputs.

SLAVE Slave (Input)

When HIGH, this pin puts the ALU in the slave mode. All
output pins become input pins and signals on them are
compared with the ALU's internally generated results. When
OE-Y is HIGH, the Y0-Y31 and PY0-PY3 inputs are
ignored. When the SLAVE pin is LOW, the ALU is put in
master mode where outputs are generated as normal.

W0-W4 Width Inputs (Input)

Width input to select the width of a contiguous bit field.

Y0-Y31 Data Out/In Lines (Input/Output)

When OE-Y is LOW and the ALU is in the Master mode, the
ALU result is enabled on the Y-bus. When OE-Y is HIGH,
the Y-bus is three-stated. In Slave mode the Y-bus acts as
external data input.

Figure 1. Detailed Block Diagram

Figure 2. Am29332 Family High-Performance System Block Diagram



PRODUCT OVERVIEW 

The Am29332 is a 32-bit wide, high-performance, non-expand- 
able Arithmetic Logic Unit (ALU). It has two 32-bit wide input 
ports (A and B) and one 32-bit wide output port (Y). These 
three ports provide flexibility and accessibility for high-perfor- 
mance processor designs. Dedicated input and output ports 
provide a flow-through architecture and avoid the penalty 
associated with switching the bus half-way through the cycle 
for input and output of data. The chip is designed for use with
a dual-access RAM (Am29334) as a register file. In addition, 
the three-bus architecture facilitates the connection of other 
arithmetic units in parallel with the Am29332 for high-perfor- 
mance systems. 

The Am29332 supports one-, two-, three-, and four-byte 
arithmetic operations. It also supports multiprecision arithmetic
and multiple bit shifts. For logical operations, it can handle
variable-length fields of up to 32 bits. The chip incorporates
dedicated hardware to allow efficient implementation of a
two-bit-at-a-time (modified Booth) multiply algorithm, supporting
signed and unsigned arithmetic data types. Similarly, hardware 
is provided to support a bit-at-a-time divide algorithm, also 
supporting signed and unsigned arithmetic data types. An 
internal 32-bit register (Q) is used by the multiply and divide 
hardware for double precision operands. For business applica- 
tions, the Am29332 supports variable-length BCD arithmetic. 

Field logical instructions operate on bit-fields taken from the A 
and B data inputs; they may be of variable width and starting 
position. A is normally the source input and B the destination
input. In general, destination bits not falling within a specified 
field are passed by the ALU unchanged. Field width and 
position are specified either by direct inputs to the chip, or by 
entries in the status register. There are two kinds of field 
logical instructions - aligned and non-aligned. The first type of 
instruction assumes that source and destination fields are 
aligned and the operation is performed only for bits within the 
specified fields. In the second type of instruction, source and 
destination fields are normally non-aligned. However, it is 
always assumed that one field (either source or destination) is 
least-significant-bit (LSB) aligned. 

If the destination field is LSB aligned then the source field is
downshifted in order to make it LSB aligned as well. Downshifting
is accomplished by making the 6-bit position input
equal to the two's complement of the number of places the
field is to be downshifted. If the source field is LSB aligned
then it is upshifted in order to align it with the destination.
Upshifting is accomplished by making the position inputs equal
to the number of places the field is to be upshifted. Any other
type of field operation is not allowed. Whenever the field
crosses the word boundary, the portion not falling within the
word boundary is ignored. This effect is useful when performing
operations on fields that overlap two different words.
Instructions to perform straightforward multiple-bit shifts (either
up or down) are also provided. Additionally, it is possible
to extract a bit-field from a word in one instruction, even if that
field overlaps a word boundary.
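
The downshift encoding and the single-instruction field extract described above can be modelled functionally as follows. This is only a sketch under stated assumptions: the function names `downshift_position` and `extract_field` are hypothetical, and the model uses a plain 64-bit right shift rather than the chip's actual datapath.

```python
MASK32 = 0xFFFFFFFF

def downshift_position(places):
    """Return the 6-bit two's complement position code for a downshift.

    Assumes 'places' is in the range 1..32, as stated in the text.
    """
    if not 1 <= places <= 32:
        raise ValueError("downshift must be 1..32 places")
    return (-places) & 0x3F

def extract_field(high_word, low_word, lsb, width):
    """Extract 'width' bits starting at bit 'lsb' of low_word.

    The field may run past bit 31 into high_word (a field that
    overlaps a word boundary). The result is LSB aligned.
    """
    if not (0 <= lsb <= 31 and 1 <= width <= 32):
        raise ValueError("lsb must be 0..31 and width 1..32")
    combined = ((high_word & MASK32) << 32) | (low_word & MASK32)
    return (combined >> lsb) & ((1 << width) - 1)
```

For instance, an 8-bit field whose low nibble sits in bits 28 to 31 of one word and whose high nibble sits in bits 0 to 3 of the next word is recovered by a single call with lsb = 28 and width = 8.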

The power and the flexibility of the processor come partly
from its ability to generate a mask to control the width of an
operation for each instruction without any overhead. For all
byte aligned instructions (three quarters of the instruction set),
the mask is either 1, 2, 3 or 4 bytes wide and is generated from
the byte width inputs (I8-I7). For all field instructions the mask
is of variable width and is generated from the position inputs
(P0-P5) and the width inputs (W0-W4). Table 1 describes
the position displacement from the position inputs and Table 2
the bit field from the width inputs.

TABLE 1. POSITION INPUTS AND BIT DISPLACEMENT

        Inputs              Bit Displacement
P5  P4  P3  P2  P1  P0            p

0   0   0   0   0   0             0
0   0   0   0   0   1             1
0   0   0   0   1   0             2
.   .   .   .   .   .             .
0   1   1   1   1   1            31
1   0   0   0   0   0           -32
1   0   0   0   0   1           -31
.   .   .   .   .   .             .
1   1   1   1   1   1            -1

TABLE 2. WIDTH INPUTS AND BIT FIELD

      Inputs            Bit Field
W4  W3  W2  W1  W0          w

0   0   0   0   0          32
0   0   0   0   1           1
0   0   0   1   0           2
.   .   .   .   .           .
1   1   1   1   1          31



Whenever the width of the operand is less than 32-bits, all 
unselected bits from the inputs of the ALU are passed to the 
output without any modification. Depending upon the instruc- 
tion type, unselected bits are taken from different sources. For 
example in all single operand instructions, bits from the source 
operand (from either A or B input) are passed in unselected bit
positions. For two operand instructions, bits from the B input 
are passed in unselected bit positions. There are some 
exceptions which are explained in the instruction set section. 

The processor has a 32-bit status register to indicate the 
status of different operations performed. The status register is 
loaded at the rising edge of the clock with new status unless 
the HOLD signal is HIGH. The bit position for each status bit is 
given in the functional description. The least significant byte of 
the status register holds the six position bits (PR0-PR5). The
two most significant bits of this byte may be read or loaded but
are otherwise unused by the ALU. The second byte (bits 8 to
15) consists of the five width bits (WR0-WR4) and three
read-only bits that are a combinational function of other status bits,
and which indicate useful branch conditions. The third byte 
consists of ALU status bits plus bits for high-speed multiply 
and divide. The most significant byte holds intermediate nibble 
carries for BCD operations. An extract-status instruction is 
provided which allows a Boolean value to be formed from any 
selected bit. This is particularly useful in machines employing a 
stack architecture. Instructions to save and restore the status 
register are provided. As the entire status of each instruction is 
stored in the status register, interrupts at any microinstruction
boundary are feasible. 

The processor has a 32-bit wide priority encoder to support 
floating-point and graphics operations. The priority encoder 
supports all byte aligned data types - the result is dependent 
upon the byte width specified. The result of a priority encode is 
also loaded into the position bits of the status register. The 
result of the prioritize operation can then be used in the 
following clock cycle, e.g., to normalize a floating-point num- 
ber or to help detect the edge of a polygon in graphics 
applications. 

To support system diagnostics, the Am29332 has a special 
"Master-Slave" mode. To use this mode, two chips are 
connected in parallel, and hence receive the same instructions 
and data. The master chip is used for the normal data path. 
However, in the slave chip, all outputs become inputs. The
slave compares the outputs of the master with its own
internally generated result. If the two do not match, the slave
will activate an error signal.

As a further diagnostic aid, byte-wise parity checking is 
performed at both the A and B data inputs. The parity error signal
(PERR) is activated if an error is detected. Parity bits (one per byte) are
generated for the 32-bit output bus. 

FUNCTIONAL DESCRIPTION 

A detailed description of each functional block is given in the 
following paragraphs. 



64-Bit Funnel Shifter 

The 64-bit funnel shifter is a combinatorial network. The 64-bit 
input is formed from a combination of the A and B inputs. This 
may be left-shifted by up to 31 bits before being used by the 
ALU. The output of the shifter is the most significant 32 bits of 
the result. The 64-bit shifter can be used on either the A or B 
operands to perform barrel shifts (either up or down) or 
rotates. The operation is controlled by positioning operands 
properly at the input of the 64-bit up-shifter. 

The number "n" by which the operand is shifted comes from
two sources: the microprogram memory via the P0-P5 pins or
the internal register (byte 0 of the status register), PR0-PR5,
as selected by an instruction bit.

In general, the 6-bit position input, P0-P5, takes a 6-bit two's
complement number representing upshifts from 0 to 31 places
(positive numbers) or downshifts from 1 to 32 places (negative
numbers).
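
As an illustration of the two paragraphs above, the position decoding and the 64-bit-in, 32-bit-out shift can be sketched in software. The names `decode_position`, `funnel_shift` and `rotate_up` are hypothetical, and placing the same word on both halves of the 64-bit input to obtain a rotate is an assumption about how operands are positioned, not a statement of the chip's internal wiring.

```python
MASK32 = 0xFFFFFFFF

def decode_position(p6):
    """Interpret the 6-bit position input as two's complement (-32..31)."""
    p6 &= 0x3F
    return p6 - 64 if p6 & 0x20 else p6

def funnel_shift(high, low, n):
    """Shift the 64-bit value high:low left by n and keep the top 32 bits."""
    if not 0 <= n <= 31:
        raise ValueError("shift amount must be 0..31")
    combined = ((high & MASK32) << 32) | (low & MASK32)
    return ((combined << n) >> 32) & MASK32

def rotate_up(word, n):
    """Rotate a 32-bit word up by n, modelled by feeding it to both halves."""
    return funnel_shift(word, word, n)
```

With zeros on the low half the same function gives a plain upshift; with the operand on the low half and zeros (or sign bits) on the high half, a shift of 32 - k models a downshift by k places.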

Mask Generator 

The mask generator logic provides the ability to generate the
appropriate mask for an operand of given width and position.
The generation of the mask depends upon two types of
instructions. The first type has byte boundary aligned operands
(widths of either 1, 2, 3 or 4 bytes) with the least
significant bit aligned to bit 0. The width of an operand is
specified by the byte width inputs (I8 and I7) as shown in Table
3. The second type of instruction has operands of variable
width (1 to 32 bits) and position. The operand is specified by
the width inputs (W0-W4) and the position inputs (P0-P5)
indicating the least significant bit position of the operand.
Thus, in this type of instruction the operand may or may not be
least significant bit aligned. Depending upon the type of
instruction, the mask generator first generates a fence of all
zeros starting from the least significant bit with the width
specified either by the byte width or the width input fields. This
fence can be upshifted by up to 31 bits by the 32-bit mask
shifter. Whenever the mask is moved up over the 32-bit
boundary, it does not wrap around; instead, ONEs are
inserted from the least significant end. This configuration
provides the ability to operate on a contiguous field located
anywhere in a word, or across a word boundary.

The mask generator can be used as a pattern generator by
allowing the mask to pass through the ALU (by using the
PASS-MASK instruction). For example, a single-bit wide mask can be
generated, and shifting it up by different amounts gives
walking ONE or walking ZERO patterns for memory tests.
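
The fence-and-shift behaviour described above can be sketched as follows. The function names are hypothetical; the width encoding (an input of 0 meaning 32 bits) follows Table 2, and a 0 bit in the mask marks a selected bit, as stated in the ALU description.

```python
MASK32 = 0xFFFFFFFF

def width_from_inputs(w5):
    """Decode the 5-bit width input: 0 means 32 bits, otherwise 1..31."""
    w5 &= 0x1F
    return 32 if w5 == 0 else w5

def make_mask(width, position):
    """Build a 32-bit mask with zeros over the selected field.

    A fence of 'width' zeros is formed at the least significant end,
    then moved up by 'position' places; ONEs enter from the least
    significant end and bits pushed past bit 31 are dropped.
    """
    if not (1 <= width <= 32 and 0 <= position <= 31):
        raise ValueError("width must be 1..32 and position 0..31")
    fence = (MASK32 << width) & MASK32
    return ((fence << position) | ((1 << position) - 1)) & MASK32

def walking_zero(count):
    """Single-bit mask moved up one place at a time (walking ZERO)."""
    return [make_mask(1, k) for k in range(count)]
```

Complementing each word of `walking_zero` gives the corresponding walking ONE pattern.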

TABLE 3.

I8   I7   Width In Bytes

0    0          4
0    1          1
1    0          2
1    1          3



Arithmetic and Logical Unit 

The ALU is a three-input unit which uses the mask as a second
or third operand in every instruction. The mask is used to
merge two operands. For all selected bits (wherever the mask
is 0), the desired operation specified by the instruction input is
performed, and for all unselected bits either corresponding
destination bits or zeros are passed through. The status of
each operation (carry, negative, zero, overflow, link) applies to
the result only over the specified width. For all byte aligned
arithmetic and logical operations (first three quarters of the
instruction set), the status is extracted from the appropriate
byte boundary. For all field operations (last quarter of the
instruction set), the operand width is assumed to be 32 bits for
status generation. The ZERO flag always indicates the status
of all bits selected by the mask.
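
The mask-controlled merge can be illustrated with a byte-aligned add. This is a sketch only: `merge_with_mask` and `byte_add` are hypothetical names, the unselected bits are taken from the B operand as described for two-operand instructions, and only the carry and zero flags are modelled.

```python
MASK32 = 0xFFFFFFFF

def merge_with_mask(result, passthrough, mask):
    """Take 'result' where the mask is 0 and 'passthrough' where it is 1."""
    return ((result & ~mask) | (passthrough & mask)) & MASK32

def byte_add(a, b, nbytes):
    """Add the low 'nbytes' bytes of a and b; pass the rest of b through.

    Returns (y, carry, zero) with carry taken at the byte boundary and
    zero computed over the selected bits only.
    """
    if nbytes not in (1, 2, 3, 4):
        raise ValueError("byte width must be 1..4")
    width = 8 * nbytes
    select = (1 << width) - 1
    mask = ~select & MASK32          # zeros over the selected bytes
    total = (a & select) + (b & select)
    y = merge_with_mask(total, b, mask)
    carry = (total >> width) & 1
    zero = (total & select) == 0
    return y, carry, zero
```

In the one-byte case the upper three bytes of B reach the output unchanged, while the carry and zero flags describe the low byte only.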

The actual width of the ALU is 34 bits. There are two extra bits 
used for the high speed signed and unsigned multiplication 
instructions. These two bits are automatically concatenated to 
the most-significant end of the ALU depending upon the width 
specified for the operation. Since the modified Booth algorithm 
requires a two-bit down-shift each cycle, these ALU bits 
generate the two most-significant bits of the partial product. 

The ALU is capable of shifting data down by two bits for the 
multiplication algorithm, up by one bit for the divide algorithm 
and single-bit-up-shifts. 
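
For readers unfamiliar with the two-bit-at-a-time modified Booth method mentioned above, a software sketch follows. It is not the chip's microcode sequence: `booth_multiply` is a hypothetical name, the partial products are accumulated by left shifts instead of the hardware's two-bit downshift, and the variable `prev` merely plays the role the text assigns to the multiplier bit saved between steps.

```python
def booth_multiply(multiplicand, multiplier, bits=32):
    """Signed multiply using radix-4 (modified Booth) recoding.

    Each step examines two multiplier bits plus the high bit of the
    previous pair and adds -2, -1, 0, +1 or +2 times the multiplicand.
    'bits' must be even; the multiplier is taken as a signed value
    of that width.
    """
    if bits % 2:
        raise ValueError("bits must be even")
    y = multiplier & ((1 << bits) - 1)
    prev = 0                      # bit to the right of the current pair
    product = 0
    for i in range(0, bits, 2):
        pair = (y >> i) & 0b11
        digit = (pair & 1) + prev - 2 * (pair >> 1)
        product += (digit * multiplicand) << i
        prev = pair >> 1
    return product
```

A 32-bit multiply therefore needs 16 steps, which is the motivation for handling two multiplier bits per cycle.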

The processor is capable of performing BCD arithmetic on 
packed BCD numbers. The ALU has separate carry logic for 
BCD operations. This logic generates nibble carries (BCD digit 
carry) from propagate and generate signals formed from the A 
and B operands. In order to simplify the hardware while 
maintaining throughput, the BCD add and subtract operations 
are performed in two cycles. In the first cycle, ordinary binary 
addition or subtraction is performed and BCD nibble carries 
are generated. These are blocked from affecting the result at 
this stage, but are saved in the status register to be used later 
for BCD correction (NC0-NC7). In the second cycle all BCD
numbers are adjusted by examining the previously generated 
nibble carries. Since all the necessary information is stored in 
the status register, the processor can be interrupted after the 
first BCD cycle. 
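
The two-cycle BCD addition can be sketched as below. The function names are hypothetical, and the correction rule used here (add 6 to every digit position that produced a decimal carry) is a standard technique assumed for illustration; the text does not spell out the exact adjustment the hardware applies.

```python
MASK32 = 0xFFFFFFFF

def bcd_add_cycle1(a, b, carry_in=0):
    """First cycle: plain binary add plus the eight BCD digit carries.

    Returns (binary_sum, nibble_carries), where bit i of nibble_carries
    is the decimal carry out of digit i. The carries do not affect the
    binary sum; they are saved for the second cycle.
    """
    binary_sum = (a + b + carry_in) & MASK32
    nibble_carries = 0
    carry = carry_in
    for i in range(8):
        digit_a = (a >> (4 * i)) & 0xF
        digit_b = (b >> (4 * i)) & 0xF
        carry = 1 if digit_a + digit_b + carry > 9 else 0
        nibble_carries |= carry << i
    return binary_sum, nibble_carries

def bcd_add_cycle2(binary_sum, nibble_carries):
    """Second cycle: adjust the binary sum using the saved digit carries."""
    correction = 0
    for i in range(8):
        if (nibble_carries >> i) & 1:
            correction |= 6 << (4 * i)
    return (binary_sum + correction) & MASK32
```

Because the second cycle needs only the binary sum and the saved carries, the operation can be suspended between the two cycles, which mirrors the interruptibility noted above.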

Priority Encoder 

The priority encoder is provided to support floating-point 
arithmetic and some graphics primitives. The priority encoder 
takes up to 32 bits as input and generates a 5-bit wide binary 
code to indicate location of the most significant one in the 
operand. Input to the priority encoder comes from the input 
multiplexer, which masks all bits that the user does not want to 
participate in the prioritization. The priority encoder supports 8, 
16, 24 and 32-bit operations depending upon the byte width 
specified. For each data type the priority encoder generates 
the appropriate binary weighted code. For example, when a 
byte width of two is specified (I8-I7 = 10), the output of the
encoder is zero when bit 15 is HIGH. However, if a byte width of
four is specified (I8-I7 = 00), the output of the encoder is 16
(decimal) if bit 15 is HIGH and bits 31-16 are LOW. Table 4
shows the output for each data type. If none of the inputs are
HIGH or the most significant bit of the data type specified is
HIGH, then the output is zero. The difference between these
two cases is indicated by the Z-flag of the status register, which
is HIGH only if all inputs are zero.
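
The encoder behaviour described above reduces to counting the leading zeros within the selected byte width. A minimal sketch, with the hypothetical name `priority_encode`:

```python
def priority_encode(value, nbytes):
    """Return (code, zero_flag) for the low 'nbytes' bytes of value.

    'code' is the distance of the most significant ONE from the top
    bit of the selected data type; it is 0 when that top bit is set
    and also 0 when no bit is set, in which case zero_flag is True.
    """
    if nbytes not in (1, 2, 3, 4):
        raise ValueError("byte width must be 1..4")
    width = 8 * nbytes
    value &= (1 << width) - 1
    if value == 0:
        return 0, True
    return width - value.bit_length(), False
```

Shifting the operand up by the returned code brings its leading ONE to the top bit of the data type, which is the normalization step mentioned for floating-point use.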

Q-Register 

The Q-register holds dividend and quotient bits for division,
and multiplier and product bits for multiplication. During
division, the contents of the Q-register are shifted left, a bit at
a time, with quotient bits inserted into bit 0. During multiplication,
the contents of the Q-register are shifted right, two bits at
a time, with product bits inserted into the most-significant two
bits (according to the selected byte width). The Q-register may
be loaded from the A or B inputs and read onto the Y bus.
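
The role of the Q-register in one-bit-at-a-time division can be illustrated with a simple restoring divide. This is an assumption-laden sketch: the text does not state whether the hardware uses a restoring or non-restoring scheme, and `divide_unsigned` is a hypothetical name. What it does share with the description is that the dividend register is shifted left one bit per step and each quotient bit enters at bit 0.

```python
def divide_unsigned(dividend, divisor, bits=32):
    """Unsigned restoring division, one quotient bit per step.

    'q' starts as the dividend and ends as the quotient, shifting
    left one place per step as described for the Q-register.
    Returns (quotient, remainder).
    """
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    mask = (1 << bits) - 1
    q = dividend & mask
    remainder = 0
    for _ in range(bits):
        # Move the top bit of q into the partial remainder.
        remainder = (remainder << 1) | (q >> (bits - 1))
        q = (q << 1) & mask
        if remainder >= divisor:
            remainder -= divisor
            q |= 1                # quotient bit inserted into bit 0
    return q, remainder
```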

Master-Slave Comparator 

All ALU outputs (except MSERR) employ three-state buffers. 
The master-slave comparator compares the input and output 
of each buffer. Any difference causes the MSERR signal to be 
made true. In Slave mode, all output buffers are disabled. 
Outputs from a second ALU may then be connected to the 
equivalent pins of the first. The comparator in the slave will 
then detect any difference in the results generated by the two. 
When the Y bus is three-stated by making Output-Enable 
false, the Y bus master-slave comparators are disabled. 

Parity Logic 

For each byte of the DA and DB inputs there is an associated 
parity bit (8 in all). If a parity error is detected on any byte, the
Parity-Error signal is made true. Four parity signals (one per
byte) are also generated for the Y bus outputs. EVEN parity is 
employed for the Am29332. 
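
Byte-wise even parity, as used on the DA, DB and Y buses, can be sketched as follows. The function names are hypothetical, and the mapping of parity bit i to byte i (bit 0 covering the least significant byte) is an assumption not confirmed by the text.

```python
def even_parity_bit(byte):
    """Parity bit that makes the total number of ONEs (data + parity) even."""
    return bin(byte & 0xFF).count("1") & 1

def byte_parity_bits(word):
    """Generate four parity bits for a 32-bit word, one per byte."""
    bits = 0
    for i in range(4):
        bits |= even_parity_bit(word >> (8 * i)) << i
    return bits

def parity_error(word, parity_bits):
    """True if any supplied parity bit disagrees with the data byte."""
    return byte_parity_bits(word) != (parity_bits & 0xF)
```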

Status Register 

All necessary information about operations performed in the 
ALU is stored in the 32-bit wide status register after every 
microcycle. Since the register can be saved, an interrupt can 
occur after any cycle. The status register can be loaded from 
either the A or B input of the chip and can be read out on the Y 
bus for saving in an external register file. For loading, the byte 
width indicates how many bytes are to be updated. The status 
register is only updated if the HOLD input is inactive. 

Each byte of the status register holds different types of 
information (see Figure 3). The least significant byte (bits 0 to
7) holds eight position bits (PR0-PR7) for the data shifter.
The two most significant bits are not used. The next most
significant byte (bits 8 to 15) holds the 5-bit width field
(WR0-WR4) for the mask generator. The three most-significant
bits of that byte (bits 13 to 15) are read-only bits that
represent three different conditions extracted from the other
bits of the status register. They are C + Z, N ⊕ V, and (N ⊕
V) + Z for bits 13, 14 and 15 respectively. These bits can be
read on the Y0 pin by the extract-status instruction. The next
byte contains all the necessary information generated by an 
ALU operation. The least-significant four bits (bits 16 to 19) 
hold carry, negative, overflow and zero flags. Bit 20 holds link 
information for single bit shifts and bits 21 and 22 are used by 
the multiply and divide instructions. The M flag holds the 
multiplier bit for the modified Booth algorithm or it holds the 
sign comparison result for the divide algorithm. The S flag 
holds the sign of the partial remainder for unsigned division. 
Both the flags (M and S) are provided as a part of the status 
register so that multiply and divide instructions can be inter- 
rupted at microinstruction boundaries. The most significant 
byte of the status register holds nibble carries for BCD 
arithmetic. Since BCD arithmetic is performed in two cycles, 
the nibble carries are saved in the first cycle and used in the 
second cycle. Since all the information is stored, BCD instruc- 
tions are also interruptible at the microinstruction boundary. 
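
The bit assignment described above can be summarized by a small decoder. This is a sketch for reading a saved status word in software; the function name and dictionary keys are assumptions for the example, and the three read-only condition bits are recomputed here from C, N, V and Z rather than read from bits 13 to 15.

```python
def decode_status(status: int) -> dict:
    """Split a 32-bit status word into the fields described in the text."""
    c = (status >> 16) & 1  # carry
    n = (status >> 17) & 1  # negative
    v = (status >> 18) & 1  # overflow
    z = (status >> 19) & 1  # zero
    return {
        "position": status & 0xFF,          # bits 0-7 (PR0-PR7)
        "width": (status >> 8) & 0x1F,      # bits 8-12 (WR0-WR4)
        "C": c, "N": n, "V": v, "Z": z,
        "L": (status >> 20) & 1,            # link
        "M": (status >> 21) & 1,            # multiply/divide bit
        "S": (status >> 22) & 1,            # sign flag
        "nibble_carries": (status >> 24) & 0xFF,
        # Derived conditions reported in bits 13, 14 and 15:
        "C_or_Z": c | z,
        "N_xor_V": n ^ v,
        "N_xor_V_or_Z": (n ^ v) | z,
    }
```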
TABLE 4.

Byte Width               Highest Priority Active Bit
I7-I8 = 00 (32-bit)      None, 31, 30, 29, 28, ...
I7-I8 = 01 (8-bit)       None, 7, 6, 5, ...
I7-I8 = 10 (16-bit)      None, 15, 14, 13, 12, ...
I7-I8 = 11 (24-bit)      None, 23, 22, 21, 20, ...

[Encoder Output column: only the closing entries are legible: 30, 31 (32-bit); 14, 15 (16-bit); 22, 23 (24-bit).]

Status Bits    Name                  Contents
0-7            Position Register     PR7 ... PR0 (bit 7 ... bit 0)
8-12           Width Register        WR4 ... WR0 (bit 12 ... bit 8)
13             Read Only             C + Z (UNSIGNED LE)
14             Read Only             N ⊕ V (SIGNED LT)
15             Read Only             (N ⊕ V) + Z (SIGNED LE)
16             C                     Carry
17             N                     Negative
18             V                     Overflow
19             Z                     Zero
20             L                     Link
21             M                     Multiply (and divide) Bit
22             S                     Sign Flag
23             -                     (no flag shown)
24-31          Nibble Carries        NC7 ... NC0 (bit 31 ... bit 24)

Note: Overflow is defined as follows:

V = (carry in to MSB) ⊕ (carry out of MSB)

Figure 3. ALU Status Register Bit Assignment
Am29332 INSTRUCTION SET
Data Types 

The Am29332 supports the following data types: 

1. Integer 

2. Binary-coded decimal 

3. Variable-length bit field 

The first two data types fall into the category of byte boundary
aligned operands (Figure 4). The size of the operand can be
1 byte, 2 bytes, 3 bytes or 4 bytes. All operands are least
significant bit (bit 0) aligned. The byte width is determined by
bits I8 and I7 of the instruction as shown in Table 5.

TABLE 5.

I8    I7    Width in Bytes
0     0     4
0     1     1
1     0     2
1     1     3

The third data type has operands of variable width (1 to 32
bits) as shown in Figure 4. The operand is specified by width
inputs (W0-W4) and position inputs (P0-P5). The position
inputs indicate the least significant bit position of the operand.
Depending on bits I8 and I7 of the instruction, the width and
position inputs can be selected from either the Status Register
or the Width and Position Pins as shown in Table 6. A
summary of the data types available is given in Table 7.

TABLE 6.

            Position          Width
I8    I7    Pins    Reg       Pins    Reg
0     0     X                 X
0     1     X                         X
1     0             X         X
1     1             X                 X

TABLE 7.

Data Type    Size                      Range
Integer                                Signed                   Unsigned
             1 byte (8 bits)           -128 to +127             0 to 255
             2 bytes (16 bits)         -2^15 to +2^15 - 1       0 to 2^16 - 1
             3 bytes (24 bits)         -2^23 to 2^23 - 1        0 to 2^24 - 1
             4 bytes (32 bits)         -2^31 to 2^31 - 1        0 to 2^32 - 1
BCD          1 to 4 bytes (8 digits)   Numeric, 2 digits per byte.
                                       Most-significant digit may be used for sign.
Variable     1 to 32 bits              Dependent on position and width inputs.

Instruction Format 

[Figure 4 diagrams: Byte Boundary Aligned Operands (1, 2, 3 or 4 bytes within bits 31-0, aligned at bit 0) and Variable-Length Bit Field (a field of width w whose least significant bit is at position p; diagram TB000630).]

p = Bit displacement of the least significant bit of the field with
respect to bit 0.
w = Width of bit field.

Figure 4. Data Types

The Am29332 has two types of Instruction Formats:

1. Byte Boundary Aligned Instructions (FORMAT 1):

[Diagram TB000098: FORMAT 1 field layout.]

2. Variable-Length Bit Field Instructions (FORMAT 2):

[Diagram TB000099: FORMAT 2 field layout, with fields labeled P/PR (POSITION), W/WR (WIDTH) and OPCODE.]

For instructions that allow a field to be shifted up or down,
P0-P5 is a two's-complement number in the range -32 to
+31 representing the direction and magnitude of the shift. For
instructions that assume a fixed field position, P0-P4
represent the position of the least-significant bit of the field and P5
is ignored.
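
The two interpretations of the position inputs can be sketched as follows. This is an illustrative Python helper only; the function name and the `shiftable` parameter are assumptions made for the example.

```python
def decode_position(p_bits: int, shiftable: bool) -> int:
    """Interpret the 6 position bits P5..P0.

    shiftable=True : two's-complement value in -32..+31 (shift direction
                     and magnitude).
    shiftable=False: P0-P4 give the LSB position of the field (0..31);
                     P5 is ignored.
    """
    p_bits &= 0x3F
    if shiftable:
        return p_bits - 64 if p_bits & 0x20 else p_bits
    return p_bits & 0x1F
```

For example, the bit pattern 111001 reads as -7 for a shiftable field and as position 25 when the field position is fixed.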
Instruction Classification

ALU instructions can be classified as follows:

A. Byte Boundary Aligned Operand Instructions: 

1. Arithmetic 

- Binary, BCD 

- Multiply steps 

- Division steps (single and multiple precision) 

2. Prioritize 

3. Logical 

4. Single-bit shifts 

5. Data movement 

B. Variable-Length Bit Field Operand Instructions:

1. N-bit shifts and rotates 

2. Bit manipulations 

3. Field logical operations (aligned, non-aligned, extract) 

4. Mask generation 

Three-fourths of the ALU instructions apply to operands that
are byte boundary aligned. For these instructions, two
orthogonal issues are the width of the operand (in bytes) and the
contents of the high-order unselected bytes on the Y bus. As
mentioned earlier, the width of the operand is specified by I8
and I7. With the exception of a few instructions, the unselected
bytes are assigned values as follows: for single-operand
instructions, unselected bytes are passed unchanged from the
source (A or B). For two-operand instructions, unselected
bytes are passed unchanged from the destination (B input).
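
The rule for unselected bytes can be sketched in a few lines of Python. This is an illustration under stated assumptions: the helper names are invented for the example, and the width encoding follows Table 5 (I8 I7 = 00 selects 4 bytes, 01 selects 1, 10 selects 2, 11 selects 3).

```python
WIDTH_BYTES = {0b00: 4, 0b01: 1, 0b10: 2, 0b11: 3}  # I8 I7 -> bytes (Table 5)


def byte_mask(i8_i7: int) -> int:
    """Mask covering the selected low-order bytes."""
    return (1 << (8 * WIDTH_BYTES[i8_i7])) - 1


def merge_unselected(result: int, passthrough: int, i8_i7: int) -> int:
    """Selected bytes come from the ALU result; unselected bytes are
    passed unchanged from `passthrough` (the source for single-operand
    instructions, the B input for two-operand instructions)."""
    m = byte_mask(i8_i7)
    return (result & m) | (passthrough & ~m & 0xFFFFFFFF)
```

For a one-byte AND, `merge_unselected(a & b, b, 0b01)` places the AND of the low bytes in Y and passes the upper three bytes of B.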

In the last quarter of the instruction set, the width of the
operand is from 1 to 32 bits (based on the width input) for field
operations, 32 bits for N-bit shift operations and 1 bit for
bit-oriented operations. In the case of field-aligned and single-bit
operands, the position bits (P0-P4) determine the least
significant bit of the operand. In the case of N-bit shifts and
field non-aligned operands, the position bits P0-P5 form a 6-bit
signed integer determining the magnitude and direction of the
shift.

Flags 

Byte-Aligned Instructions 

The zero flag always looks only at the selected bytes:

Z ← ((Y AND bytemask(byte width)) = 0)

Similarly, N ← sign-bit(Y, byte width), where the function
"sign-bit" returns bit 7, 15, 23, or 31 of the first argument for
byte widths 01, 10, 11, or 00 respectively.

Also, C ← carry(byte width) returns the carry from the
appropriate byte boundary, and:

V ← overflow(byte width) = (carry into MSB) ⊕ (carry out of MSB)

returns the overflow from the appropriate byte boundary.

The link (L) flag is generally loaded with the bit moved out of
the highest selected byte in the case of upshifts, or the bit
moved out of the least significant byte for downshifts. Figure 5
shows the shift operation using the link bit. Other status flags have
specialized uses, explained in the following sections.
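
The flag definitions above can be checked with a short Python sketch of a byte-width addition. It is a software illustration rather than a description of the ALU internals; `add_flags` and its parameters are names assumed for the example, and the width is given directly in bytes.

```python
def add_flags(a: int, b: int, width_bytes: int, carry_in: int = 0):
    """Add the selected bytes of a and b; return (result, flags).

    Z looks only at the selected bytes, N is the top selected bit,
    C is the carry out of the selected width, and
    V = (carry into MSB) XOR (carry out of MSB).
    """
    bits = 8 * width_bytes
    mask = (1 << bits) - 1
    a_sel, b_sel = a & mask, b & mask
    total = a_sel + b_sel + carry_in
    y = total & mask
    carry_out = (total >> bits) & 1
    low = mask >> 1  # every selected bit except the MSB
    carry_into_msb = (((a_sel & low) + (b_sel & low) + carry_in) >> (bits - 1)) & 1
    flags = {
        "Z": int(y == 0),
        "N": (y >> (bits - 1)) & 1,
        "C": carry_out,
        "V": carry_into_msb ^ carry_out,
    }
    return y, flags
```

Adding 0x7F and 0x01 at a width of one byte sets N and V but not C, while adding 0xFF and 0x01 sets Z and C with V clear, regardless of what the unselected bytes contain.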

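[Figure 5 diagrams (DF006190). Shift Down: the selected 1, 2, 3, or 4 bytes of A (or B) move down one bit; a multiplexer (MUX) supplies the fill for the vacated most significant bit, with the link bit L and the sign bit among its inputs. Shift Up: the selected 1, 2, 3, or 4 bytes of A (or B) move up one bit; a multiplexer (MUX) supplies the fill for the vacated least significant bit, with the link bit L among its inputs.]

Shift with sign bit fill implements arithmetic shift.

Figure 5. Upshift/Downshift Using Link Bit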
Variable-Length Field Instructions

Generally, only N and Z are affected. N takes the
most-significant bit of the 32-bit result (i.e., N ← Y31). Z detects
zeros in the selected field of the result (i.e., Z ← ((Y AND
bitmask(position, width)) = 0)).
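
A minimal Python sketch of the N and Z rules for field instructions follows. The helper names are assumptions for the example; the mask is truncated at bit 31 on the assumption that a field never wraps around the end of the word.

```python
MASK32 = 0xFFFFFFFF


def bitmask(position: int, width: int) -> int:
    """Ones in bits position .. position+width-1, truncated at bit 31."""
    return (((1 << width) - 1) << position) & MASK32


def field_flags(y: int, position: int, width: int) -> dict:
    """N is bit 31 of the result; Z tests only the selected field."""
    return {
        "N": (y >> 31) & 1,
        "Z": int((y & bitmask(position, width)) == 0),
    }
```

With an 11-bit field at position 20, a word whose bits 20 to 30 are all zero gives Z = 1 even though lower bits are set.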

Output Select 

The Register Status pin, RS, may be used to switch the C, Z,
N, V, and L output pins between the direct output of the ALU
and the outputs of the corresponding bits in the status register.
If the direct status output is selected, then for instructions that
do not affect a particular flag (e.g., carry for logical operations)
that output will reflect the state of its corresponding bit in the
status register. Similarly, when the HOLD signal is made
HIGH, the C, Z, N, V and L pins will be made equal to the
contents of the status register, regardless of the RS input.
INSTRUCTION SET SUMMARY

Operand Size: Variable Byte Width: 1, 2, 3, 4 Bytes

Type                Operation                                   Data Type
Arithmetic          • Increment by one, two, four               Binary Integer and BCD
                    • Decrement by one, two, four
                    • Add, addc (carry - macro/micro)
                    • Sub, subr
                    • Subc, subrc (carry/borrow)
                    • BCD sum and difference correct steps
                    • Negate (two's complement)                 Binary Integer
                    • Multiply steps (modified Booth)           (Signed and unsigned)
                    • Divide steps (non-restoring)
Prioritize          • Prioritize                                Binary
Logical             • Not, OR, AND, XOR, XNOR, zero, sign       Binary
Single-Bit Shifts   • Upshift with 0, 1, link fill              Binary
                    • Downshift with 0, 1, link, sign fill
                    (Single and double precision)
Data Movement       • Zero extend                               Binary
                    • Sign extend
                    • Pass-status, Q-Reg
                    • Load-status, Q-Reg
                    • Merge

Operand Size: 32 Bits

Type                Operation                                   Data Type
N-Bit Shifts        • Upshift by 0 to 31 bits with 0 fill       Binary
                    • Downshift by 1 to 32 bits with 0, sign fill
N-Bit Rotates       • Rotate by 0 to 31 bits

Operand Size: Single Bit

Type                Operation                                   Data Type
Bit Manipulation    • Extract                                   Binary
                    • Set
                    • Reset

Operand Size: Variable Length Bitfield: 1 to 32 Bits

Type                Operation                                   Data Type
Field Logical       • Not, OR, XOR, AND, extract, insert        Binary
(aligned and
non-aligned)
Mask                • Pass-mask                                 Binary
INSTRUCTION SET GLOSSARY

(Sorted by Opcode in Hex Notation)

Opcode  Name         Opcode  Name         Opcode  Name          Opcode  Name
00      ZERO-EXTA    20      DN1-0F-A     40      AND           60      NB-SN-SHA
01      ZERO-EXTB    21      DN1-0F-B     41      XNOR          61      NB-SN-SHB
02      SIGN-EXTA    22      DN1-0F-AQ    42      ADD           62      NB-0F-SHA
03      SIGN-EXTB    23      DN1-0F-BQ    43      ADDC          63      NB-0F-SHB
04      PASS-STAT    24      DN1-1F-A     44      SUB           64      NBROT-A
05      PASS-Q       25      DN1-1F-B     45      SUBC          65      NBROT-B
06      LOADQ-A      26      DN1-1F-AQ    46      SUBR          66      EXTBIT-A
07      LOADQ-B      27      DN1-1F-BQ    47      SUBRC         67      EXTBIT-B
08      NOT-A        28      DN1-LF-A     48      SUM-CORR-A    68      SETBIT-A
09      NOT-B        29      DN1-LF-B     49      SUM-CORR-B    69      SETBIT-B
0A      NEG-A        2A      DN1-LF-AQ    4A      DIFF-CORR-A   6A      RSTBIT-A
0B      NEG-B        2B      DN1-LF-BQ    4B      DIFF-CORR-B   6B      RSTBIT-B
0C      PRIOR-A      2C      DN1-AR-A     4C      -             6C      SETBIT-STAT
0D      PRIOR-B      2D      DN1-AR-B     4D      -             6D      RSTBIT-STAT
0E      MERGEA-B     2E      DN1-AR-AQ    4E      SDIVFIRST     6E      NOTF-AL-B
0F      MERGEB-A     2F      DN1-AR-BQ    4F      UDIVFIRST     6F      PASSF-AL-B
10      DECR-A       30      UP1-0F-A     50      SDIVSTEP      70      NOTF-A
11      DECR-B       31      UP1-0F-B     51      SDIVLAST1     71      NOTF-AL-A
12      INCR-A       32      UP1-0F-AQ    52      MPDIVSTEP1    72      PASSF-A
13      INCR-B       33      UP1-0F-BQ    53      MPSDIVSTEP3   73      PASSF-AL-A
14      DECR2-A      34      UP1-1F-A     54      UDIVSTEP      74      ORF-A
15      DECR2-B      35      UP1-1F-B     55      UDIVLAST      75      ORF-AL-A
16      INCR2-A      36      UP1-1F-AQ    56      MPDIVSTEP2    76      XORF-A
17      INCR2-B      37      UP1-1F-BQ    57      MPUDIVSTP3    77      XORF-AL-A
18      DECR4-A      38      UP1-LF-A     58      REMCORR       78      ANDF-A
19      DECR4-B      39      UP1-LF-B     59      QUOCORR       79      ANDF-AL-A
1A      INCR4-A      3A      UP1-LF-AQ    5A      SDIVLAST2     7A      EXTF-A
1B      INCR4-B      3B      UP1-LF-BQ    5B      UMULFIRST     7B      EXTF-B
1C      LDSTAT-A     3C      ZERO         5C      UMULSTEP      7C      EXTF-AB
1D      LDSTAT-B     3D      SIGN         5D      UMULLAST      7D      EXTF-BA
1E      -            3E      OR           5E      SMULSTEP      7E      EXTBIT-STAT
1F      -            3F      XOR          5F      SMULFIRST     7F      PASS-MASK
TABLE 6-1. DATA MOVEMENT INSTRUCTIONS

                                            Y Output
Mnemonics    Code   Description             Unsel    Sel
ZERO-EXTA    00     Zero Extend             0        A
ZERO-EXTB    01                             0        B
SIGN-EXTA    02     Sign Extend             Sign     A
SIGN-EXTB    03                             Sign     B
MERGEA-B     0E     Merge A with B          B        A Merge B
MERGEB-A     0F     Merge B with A          A        B Merge A

[Status columns (S, M, L, Z, V, N, C): each instruction carries one update mark; its column is not legible.]

TABLE 6-2. DATA MOVEMENT INSTRUCTIONS

                                            Y Output          Status
Mnemonics    Code   Description             Unsel    Sel      Register    Status (S M L Z V N C)
PASS-STAT    04     Pass Status Register    B        S
LDSTAT-A     1C     Load Status Register    S        A        A           + + + + + + +
LDSTAT-B     1D                             S        B        B           + + + + + + +

TABLE 6-3. DATA MOVEMENT INSTRUCTIONS

                                            Y Output          Q
Mnemonics    Code   Description             Unsel    Sel      Register
PASS-Q       05     Pass Q Register         B        Q
LOADQ-A      06     Load Q                  Q        A        A
LOADQ-B      07                             Q        B        B

[Status columns (S, M, L, Z, V, N, C) for LOADQ-A and LOADQ-B: two update marks each; their columns are not legible.]

Legend:
Unsel = Unselected Byte(s)
Sel = Selected Byte(s)
A = A Input
B = B Input
Q = Q Register
+ = Updated only if byte width is 3 or 4
* = Updated

Examples:
2, ZERO-EXTB    Pass lower two bytes of B to Y with zero fill on upper two bytes.
0, LOADQ-A      Load all four bytes of A into Q Register; pass updated Q Register to Y.
TABLE 7. LOGICAL INSTRUCTIONS

                                          Y Output
Mnemonics   Code   Description            Sel
NOT-A       08     One's Complement       NOT A
NOT-B       09                            NOT B
ZERO        3C     Pass Zero              0
SIGN        3D     Pass Sign              0 (N = 0); -1 (N = 1)
OR          3E     OR                     A OR B
XOR         3F     EXOR                   A XOR B
AND         40     AND                    A AND B
XNOR        41     XNOR                   A XNOR B

[Unsel column and Status columns: entries not legible.]

Note: 1. These instructions use the byte aligned instruction format (FORMAT 1).

Legend:
Unsel = Unselected Byte(s)
Sel = Selected Byte(s)
A = A Input
B = B Input
Q = Q Register
* = Updated

Examples:
2, NOT-A    Complement low order two bytes of A and output to Y with
            high order two bytes of A uncomplemented.
1, AND      AND first byte of A and B. Output to Y with high three
            bytes of B.



TABLE 8-1. SINGLE-BIT SHIFT INSTRUCTIONS (SINGLE PRECISION)

                                           Y Output
Mnemonics   Code   Description             Unsel   Sel
DN1-0F-A    20     Downshift, Zero Fill    A       Y(i) = A(i+1), Y(msb) = 0
DN1-0F-B    21                             B       Y(i) = B(i+1), Y(msb) = 0
DN1-1F-A    24     Downshift, One Fill     A       Y(i) = A(i+1), Y(msb) = 1
DN1-1F-B    25                             B       Y(i) = B(i+1), Y(msb) = 1
DN1-LF-A    28     Downshift, Link Fill    A       Y(i) = A(i+1), Y(msb) = L
DN1-LF-B    29                             B       Y(i) = B(i+1), Y(msb) = L
DN1-AR-A    2C     Downshift, Sign Fill    A       Y(i) = A(i+1), Y(msb) = N
DN1-AR-B    2D                             B       Y(i) = B(i+1), Y(msb) = N
UP1-0F-A    30     Upshift, Zero Fill      A       Y(i) = A(i-1), Y(0) = 0
UP1-0F-B    31                             B       Y(i) = B(i-1), Y(0) = 0
UP1-1F-A    34     Upshift, One Fill       A       Y(i) = A(i-1), Y(0) = 1
UP1-1F-B    35                             B       Y(i) = B(i-1), Y(0) = 1
UP1-LF-A    38     Upshift, Link Fill      A       Y(i) = A(i-1), Y(0) = L
UP1-LF-B    39                             B       Y(i) = B(i-1), Y(0) = L

[Status columns (S, M, L, Z, V, N, C): update marks not legible.]

Note: 1. These instructions use the byte aligned instruction format (FORMAT 1).

Example:
2, UP1-1F-A    Shift lower two bytes of A up one bit. Set LSB to 1. Fill
               unselected bytes with upper two bytes of A.



TABLE 8-2. SINGLE-BIT SHIFT INSTRUCTIONS (DOUBLE PRECISION)

                                           Y Output & Q Register
Mnemonics    Code   Description            Selected Bytes          Notes
DN1-0F-AQ    22     Downshift, Zero Fill   A, Q                    2)
DN1-0F-BQ    23                            B, Q                    3)
DN1-1F-AQ    26     Downshift, One Fill    A, Q                    2)
DN1-1F-BQ    27                            B, Q                    3)
DN1-LF-AQ    2A     Downshift, Link Fill   A, Q                    2)
DN1-LF-BQ    2B                            B, Q                    3)
DN1-AR-AQ    2E     Downshift, Sign Fill   A, Q                    2)
DN1-AR-BQ    2F                            B, Q                    3)
UP1-0F-AQ    32     Upshift, Zero Fill     A, Q                    2)
UP1-0F-BQ    33                            B, Q                    3)
UP1-1F-AQ    36     Upshift, One Fill      A, Q                    2)
UP1-1F-BQ    37                            B, Q                    3)
UP1-LF-AQ    3A     Upshift, Link Fill     A, Q                    2)
UP1-LF-BQ    3B                            B, Q                    3)

[Status columns: update marks not legible.]

Notes: 1. These instructions use the byte aligned instruction format (FORMAT 1).
       2. Y unselected bytes from A; Q unselected bytes unchanged.
       3. Y unselected bytes from B; Q unselected bytes unchanged.

Legend: Unsel = Unselected Byte(s)
        Sel = Selected Byte(s)
        A = A Input
        B = B Input
        Q = Q Register
        * = Updated

Examples:

0, DN1-AR-BQ    Shift 64 bits (all 32 bits of both B and Q)
                down by one bit. LSB of B fills MSB of Q.
                MSB of B set to sign bit (bit N of status register).

                [Diagram: sign bit -> B (32 bits) -> Q (32 bits) -> link status bit]

3, UP1-LF-AQ    Shift 48 bits (24 bits of A and 24 bits of Q)
                up by one bit. MSB of 24-bit Q fills LSB of A.
                MSB of 24-bit A sets link status bit. LSB of
                Q is filled with original link value.

                [Diagram DF006200: link status bit <- A (24 bits) <- Q (24 bits) <- original link value]
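
The UP1-LF-AQ example can be traced with a short Python sketch. It follows the wording of the example and of note 2 (Y unselected bytes from A, Q unselected bytes unchanged); the function name, argument order and return tuple are assumptions made for this illustration.

```python
def up1_lf_aq(a: int, q: int, link: int, width_bytes: int):
    """Double-precision upshift by one bit with link fill.

    Returns (y, new_q, new_link) for the selected low-order bytes:
    MSB of Q fills LSB of A, MSB of A goes to link, old link fills LSB of Q.
    """
    bits = 8 * width_bytes
    mask = (1 << bits) - 1
    a_sel, q_sel = a & mask, q & mask
    new_link = (a_sel >> (bits - 1)) & 1
    y_sel = ((a_sel << 1) | (q_sel >> (bits - 1))) & mask
    q_new_sel = ((q_sel << 1) | (link & 1)) & mask
    keep = ~mask & 0xFFFFFFFF  # unselected bytes pass through unchanged
    return (a & keep) | y_sel, (q & keep) | q_new_sel, new_link
```

With a byte width of 3, only the lower 24 bits of A and Q take part in the 48-bit shift; the top byte of each word is carried through untouched.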
TABLE 9. PRIORITIZE INSTRUCTIONS

Mnemonics   Code   Description      Y Output
PRIOR-A     0C     Prioritization   Location of Highest 1 Bit
PRIOR-B     0D                      Location of Highest 1 Bit

[Status columns (S, M, L, Z, V, N, C): each instruction carries one update mark; its column is not legible.]

Notes: 1. These instructions use the byte aligned instruction format (FORMAT 1).
       2. Priority also loaded into STATUS <7:0>.
       3. Refer to Table 4.

Legend: A = A Input
        B = B Input
        Q = Q Register
        * = Updated

Example:

3, PRIOR-A    Assume A is:

              [unselected byte] | 00100010 | 00000000 | 00000000

              Value placed on Y is 2.
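
The PRIOR-A example is consistent with counting the zero bits above the highest 1 within the selected width. The Python sketch below encodes that reading; it is an assumption drawn from this single example, the function name is invented, and the value produced when no bit is active is not modeled (the sketch returns None).

```python
def prioritize(word: int, width_bytes: int):
    """Number of zero bits above the highest 1 in the selected bytes.

    Returns None when no selected bit is set; the encoder output for
    that case is not modeled in this sketch.
    """
    bits = 8 * width_bytes
    selected = word & ((1 << bits) - 1)
    if selected == 0:
        return None
    return bits - selected.bit_length()
```

For a 3-byte operand whose lower 24 bits are 00100010 00000000 00000000, the highest 1 is bit 21, two places below bit 23.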





TABLE 10-1. ARITHMETIC INSTRUCTIONS

                                         Y Output
Mnemonics   Code   Description           Unsel   Sel
NEG-A       0A     Two's Complement      A       NOT A + 1
NEG-B       0B                           B       NOT B + 1
INCR-A      12     Increment by One      A       A + 1
INCR-B      13                           B       B + 1
INCR2-A     16     Increment by Two      A       A + 2
INCR2-B     17                           B       B + 2
INCR4-A     1A     Increment by Four     A       A + 4
INCR4-B     1B                           B       B + 4
DECR-A      10     Decrement by One      A       A - 1
DECR-B      11                           B       B - 1
DECR2-A     14     Decrement by Two      A       A - 2
DECR2-B     15                           B       B - 2
DECR4-A     18     Decrement by Four     A       A - 4
DECR4-B     19                           B       B - 4

[Status columns (S, M, L, Z, V, N, C): update marks only partly legible.]

Notes: 1. These instructions use the byte aligned instruction format (FORMAT 1).
       2. Borrow, rather than carry, is generated if BOROW is HIGH (borrow = NOT carry).
       3. Nibble bits are set by these instructions. NEG-A (or NEG-B) and DIFF-CORR may be used to
          form the 10's complement of a BCD number. Use SUM-CORR (for increment) or DIFF-CORR (for
          decrement) to increment or decrement a BCD number.

Legend: Unsel = Unselected Byte(s)
        Sel = Selected Byte(s)
        A = A Input
        B = B Input
        Q = Q Register
        * = Updated

Example:

2, DECR4-A    Decrement lower two bytes of A by 4.
TABLE 10-2. ARITHMETIC INSTRUCTIONS

                                                Y Output
Mnemonics     Code   Description                Unsel   Sel                  Notes
ADD           42     Add                        B       A + B
ADDC          43     Add with Carry             B       A + B + C            6)
SUB           44     Subtract                           A + B + 1
SUBR          46                                        B + A + 1
SUBC          45     Subtract with Carry                A + B + 1 + C        2) 6)
SUBRC         47                                        B + A + 1 + C        2) 6)
SUM-CORR-A    48     Correct BCD Nibbles                Corrected A          3)
SUM-CORR-B    49     for Addition                       Corrected B          3)
DIFF-CORR-A   4A     Correct BCD Nibbles                Corrected A          3)
DIFF-CORR-B   4B     for Subtraction                    Corrected B          3)

[Sel column: the overbars marking complemented operands in the subtract rows are not legible. Status columns: update marks not legible.]

Notes: 1. These instructions use the byte aligned instruction format (FORMAT 1).
       2. BOROW is LOW. For subtract operations, a borrow rather than a carry is stored in STATUS if BOROW is HIGH.
          Carry is always generated for ADD regardless of BOROW.
       3. [Largely illegible; the surviving text states that a nibble carry/borrow that is set to 1 generates "6" internally as ...]
       4. Use SUM-CORR or DIFF-CORR to add or subtract a BCD number.
       5. Use ADDC, SUBC, or SUBRC to perform operations on integers longer than 32 bits.
       6. Carry bit is obtained from MCin if M/m is HIGH. Otherwise, carry is obtained from the C status bit.

Legend: Unsel = Unselected Byte(s)
        Sel = Selected Byte(s)
        A = A Input
        B = B Input
        Q = Q Register
        * = Updated only if byte width is 3 or 4

Example:

0, ADD    Add two 32-bit two's-complement integers.
TABLE 11-1. DIVIDE INSTRUCTIONS (Aligned Format)

              I6-I0
Name          Code    Description                             Output

Signed Divide Steps
SDIVFIRST     4E      First Instruction for Signed Divide     Y, Q
SDIVSTEP      50      Iterate Step (#bits - 1 times)          Y, Q
SDIVLAST1     51      Last Divide Instruction Unless          Y, Q
SDIVLAST2     5A      Dividend & Remainder Negative

Unsigned Divide Steps
UDIVFIRST     4F      First Instruction for Unsigned Divide   Y, Q
UDIVSTEP      54      Iterate Step (#bits - 1 times)          Y, Q
UDIVLAST      55      Last Instruction                        Y, Q

Multiprecision Divide Steps
MPDIVSTEP1    52      First Instruction                       Y, Q
MPDIVSTEP2    56      Executed ... Times for Double ...       Y, Q
MPSDIVSTEP3   53      Last Instruction of Inner Loop          Y, Q
MPUDIVSTP3    57      Used for Unsigned Divide                Y, Q

Correction Steps
REMCORR       58      Correct Remainder After Divide
QUOCORR       59      Correct Quotient After Divide

[Columns "Source for Unselected Bytes" and "Status (S M L Z V N C)": entries not legible.]

TABLE 11-2. EXAMPLE CODING FORM (Signed Division)

Column groups: Am29331 (OP, Branch, Cond Select, Multi Sel); Am29332 (B/W, OP, Width, Position); Am29334 (A-IN, B-IN, Y-OUT, OE).

Am29331 OP     Am29332 OP    Am29334 registers (A-IN, B-IN, Y-OUT as legible)
CONT           LOADQ-A       R2
CONT           SIGN          R3
FOR_D (15)     SDIVFIRST     R3, R3
DJMP_S         SDIVSTEP      R4, R3, R3
CONT           SDIVLAST1     R4, R3
BRCC_D (DONE)
CONT           SDIVLAST2     R4, R3, R3
CONT           PASS-Q        R1
CONT           QUOCORR       R1
CONT           REMCORR       R4

Note: Divisor in A, Dividend in Q
      Quotient in Q, Remainder in B

Legend: A = A Input
        B = B Input
        S = Status Register
        Q = Q Register

        R1 = Quotient
        R2 = Dividend
        R3 = Remainder
        R4 = Divisor
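
The divide steps are described as non-restoring. As background, the following Python sketch shows the textbook unsigned non-restoring algorithm, one quotient bit per iteration, with a final remainder correction. It is not a model of the individual UDIVFIRST, UDIVSTEP and UDIVLAST microinstructions or of the S and M flags; the function name and the use of Python's unbounded integers for the signed partial remainder are assumptions of the example.

```python
def nonrestoring_divide(dividend: int, divisor: int, bits: int = 32):
    """Unsigned non-restoring division; returns (quotient, remainder)."""
    if divisor == 0:
        raise ValueError("division by zero")
    mask = (1 << bits) - 1
    q = dividend & mask
    d = divisor & mask
    r = 0  # signed partial remainder
    for _ in range(bits):
        msb = (q >> (bits - 1)) & 1
        q = (q << 1) & mask
        # Subtract if the partial remainder is non-negative, else add back.
        if r >= 0:
            r = r * 2 + msb - d
        else:
            r = r * 2 + msb + d
        if r >= 0:
            q |= 1  # quotient bit is 1 when the new remainder is non-negative
    if r < 0:
        r += d      # final remainder correction
    return q, r
```

The sign of `r` between iterations plays the role the text assigns to the saved sign of the partial remainder: it decides whether the next step adds or subtracts the divisor.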
TABLE 12-1. MULTIPLY INSTRUCTIONS (Aligned Format)

                                                      Source for
             I6-I0                                    Unselected
Name         Code    Description                      Bytes        Output

Signed Multiply Steps
SMULFIRST    5F      First multiply instruction       B            Y (1)
SMULSTEP     5E      Iterate step (# bits/2 - 1 steps) B           Y (1)

Unsigned Multiply Steps
UMULFIRST    5B      First multiply instruction       B            Y (1)
UMULSTEP     5C      Iterate step (# bits/2 - 1 steps) B           Y (1)
UMULLAST     5D      Last multiply instruction        B            Y (1)

[Status columns (S, M, L, Z, V, N, C): update marks not legible.]

TABLE 12-2. EXAMPLE CODING FORM (Unsigned Multiply)

Column groups: Am29331 (OP, Branch, Cond Select, Multi Sel); Am29332 (B/W, OP, Width, Position); Am29334 (A-IN, B-IN, Y-OUT).

Am29331 OP    B/W   Am29332 OP    Am29334 registers (A-IN, B-IN, Y-OUT as legible)
CONT          3     ZERO          R3, R3
CONT          3     LOADQ-A       R1
FOR_D         3     UMULFIRST     R2, R3, R3
DJMP_S        3     UMULSTEP      R2, R3, R3
CONT          3     UMULLAST      R2, R3, R3
CONT          3     PASS-Q        R4

[The loop count on the FOR_D line is not legible.]

Note: 1. Put ALU output in B.
      2. Multiplicand in A, Multiplier in Q
         Product (HIGH) in B, Product (LOW) in Q

Legend: A = A Input
        B = B Input
        S = Status Register
        Q = Q Register
        R1 = Multiplier
        R2 = Multiplicand
        R3 = Product (HIGH)
        R4 = Product (LOW)
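
The multiply steps are described as using the modified Booth algorithm, which retires two multiplier bits per step (hence #bits/2 steps in total). The Python sketch below shows the textbook radix-4 Booth recoding for signed operands as background; it is not a cycle-accurate model of SMULFIRST and SMULSTEP, and the function name and the `prev` variable (the bit just below the current pair, which the status register's M flag is said to hold) are assumptions of the example.

```python
# Booth digit for the bit triple (b[i+1], b[i], b[i-1]):
# digit = -2*b[i+1] + b[i] + b[i-1]
BOOTH_DIGIT = {0: 0, 1: 1, 2: 1, 3: 2, 4: -2, 5: -1, 6: -1, 7: 0}


def booth_radix4(multiplicand: int, multiplier: int, bits: int = 32) -> int:
    """Signed multiply of two `bits`-wide two's-complement values
    (bits must be even); returns the signed product."""
    mask = (1 << bits) - 1
    a = multiplicand & mask
    if a >> (bits - 1):
        a -= 1 << bits          # signed value of the multiplicand
    q = multiplier & mask
    product = 0
    prev = 0                    # bit to the right of the current pair
    for i in range(0, bits, 2):
        pair = (q >> i) & 3
        digit = BOOTH_DIGIT[(pair << 1) | prev]
        product += (digit * a) << i
        prev = (pair >> 1) & 1
    return product
```

Because each recoded digit is one of -2, -1, 0, 1, 2, every step needs only a shift and at most one add or subtract of the multiplicand.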
TABLE 13. SHIFT/ROTATE INSTRUCTIONS

Mnemonics    Code   Description              Y Output                          Notes
NB-0F-SHA    62     Field Shift, Zero Fill   Y(i+p) = A(i), fill = 0           2)
NB-0F-SHB    63                              Y(i+p) = B(i), fill = 0           2)
NB-SN-SHA    60     Field Shift, Sign Fill   Y(i+p) = A(i), fill = N           2)
NB-SN-SHB    61                              Y(i+p) = B(i), fill = N           2)
NBROT-A      64     Field Rotate             Y(i) = A((i-p) mod 32)            3)
NBROT-B      65                              Y(i) = B((i-p) mod 32)            3)

[Status columns (S, M, L, Z, V, N, C): each instruction carries one update mark; its column is not legible.]

Notes: 1. These instructions use the field instruction format (FORMAT 2).
       2. "p" stands for bit displacement from P0-P5 or from PR0-PR5 (-32 ≤ p ≤ 31).
          If p is positive, Y(p-1) to Y(0) are equal to the fill bit.
          If p is negative, Y(31) to Y(31+p+1) are equal to the fill bit.
       3. The sign of the position input is ignored for this instruction and P0-P4 are treated as a positive magnitude for a
          circular upshift.

Legend: A = A Input
        B = B Input
        Q = Q Register
        * = Updated

Examples:

NB-0F-SHA, *, 4      Shift A up 4 bits and zero fill
NB-SN-SHB, *, -17    Shift B down 17 bits and sign fill

*Width field not used
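
The shift and rotate rules in notes 2 and 3 can be sketched in Python as follows. The function names are assumptions for the example, and the fill bit is passed in explicitly (0 for zero fill, or the value of N for sign fill).

```python
MASK32 = 0xFFFFFFFF


def nbit_shift(x: int, p: int, fill: int) -> int:
    """Y[i+p] = X[i] for -32 <= p <= 31; vacated bits take the fill bit."""
    if not -32 <= p <= 31:
        raise ValueError("p must be in -32..31")
    x &= MASK32
    if p >= 0:
        y = (x << p) & MASK32
        vacated = (1 << p) - 1                 # Y(p-1) .. Y(0)
    else:
        y = x >> -p
        vacated = MASK32 & ~(MASK32 >> -p)     # Y(31) .. Y(31+p+1)
    return (y | vacated) if fill else y


def nbit_rotate(x: int, p: int) -> int:
    """Circular upshift by the magnitude held in P0-P4 (sign ignored)."""
    p &= 0x1F
    x &= MASK32
    return ((x << p) | (x >> (32 - p))) & MASK32
```

A downshift by 32 with fill leaves every bit of Y equal to the fill bit, which is why the negative range extends one step further than the positive range.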

TABLE 14-1. BIT-MANIPULATION INSTRUCTIONS

                                       Y Output
Mnemonics     Code   Description       Unsel   Sel                                                  Notes
SETBIT-A      68     Bit Set           A       Y(i) = A(i), Y(p) = 1
SETBIT-B      69                       B       Y(i) = B(i), Y(p) = 1
RSTBIT-A      6A     Bit Reset         A       Y(i) = A(i), Y(p) = 0
RSTBIT-B      6B                       B       Y(i) = B(i), Y(p) = 0
EXTBIT-A      66     Bit Extract               if p ≥ 0, Y(0) = A(p); if p < 0, Y(0) = NOT A(p)     2)
EXTBIT-B      67                               if p ≥ 0, Y(0) = B(p); if p < 0, Y(0) = NOT B(p)     2)
EXTBIT-STAT   7E                               if p ≥ 0, Y(0) = S(p); if p < 0, Y(0) = NOT S(p)     2)

[Status columns (S, M, L, Z, V, N, C): each instruction carries one update mark; its column is not legible.]

Notes: 1. These instructions use the field instruction format (FORMAT 2).
       2. Y(31) to Y(1) are set to zero. "p" stands for the bit displacement from P0-P4 or from PR0-PR5. The sign of the
          position input is ignored.

TABLE 14-2. BIT-MANIPULATION INSTRUCTIONS

                                         Status
Mnemonics     Code   Description         Register    Y Output   Status (S M L Z V N C)
SETBIT-STAT   6C     Status Bit Set      S(p) = 1    S          * * * * * * *
RSTBIT-STAT   6D                         S(p) = 0    S          * * * * * * *

Notes: 1. These instructions use the field instruction format (FORMAT 2).
       2. "p" stands for the bit displacement from P0-P5 or from PR0-PR5.

Legend:
Unsel = Unselected field
Sel = Selected field
A = A Input
B = B Input
Q = Q Register
* = Updated

Examples:

RSTBIT-B,,3       3rd bit is set to 0 in B.
EXTBIT-STAT,,-    4th bit in status register is extracted and
                  inverted.
[Figure 6 diagrams: three cases of a field operation "A op B" on a field of width W at position P.
Aligned Fields: the fields of A and B occupy the same bit positions.
Non-Aligned Fields, Case 1: if position (P0-P5) ≥ 0, A is LSB aligned. Width (W0-W4) = 1 to 32.
Non-Aligned Fields, Case 2: if position (P0-P5) < 0, B is LSB aligned. Width (W0-W4) = 1 to 32.]

Figure 6. Field Logical Operations
TABLE 15. FIELD LOGICAL INSTRUCTIONS

                                                   Y Output
Mnemonics    Code   Description        Notes       Unsel   Sel
PASSF-AL-A   73     Field Pass         3)          B       Y(i) = A(i)
PASSF-AL-B   6F                        3)          B       Y(i) = B(i)
PASSF-A      72                        4)          B       if p ≥ 0, Y(i) = A(i-p); if p < 0, Y(i-|p|) = A(i)
NOTF-AL-A    71     Field Complement   3)          B       Y(i) = NOT A(i)
NOTF-AL-B    6E                        3)          B       Y(i) = NOT B(i)
NOTF-A       70                        4)          B       if p ≥ 0, Y(i) = NOT A(i-p); if p < 0, Y(i-|p|) = NOT A(i)
ORF-AL-A     75     Field OR           3)          B       Y(i) = A(i) OR B(i)
ORF-A        74                        4)          B       if p ≥ 0, Y(i) = A(i-p) OR B(i); if p < 0, Y(i-|p|) = A(i) OR B(i-|p|)
XORF-AL-A    77     Field XOR          3)          B       Y(i) = A(i) XOR B(i)
XORF-A       76                        4)          B       if p ≥ 0, Y(i) = A(i-p) XOR B(i); if p < 0, Y(i-|p|) = A(i) XOR B(i-|p|)
ANDF-AL-A    79     Field AND          3)          B       Y(i) = A(i) AND B(i)
ANDF-A       78                        4)          B       if p ≥ 0, Y(i) = A(i-p) AND B(i); if p < 0, Y(i-|p|) = A(i) AND B(i-|p|)
EXTF-A       7A     Field Extract      4) 5)               if p ≥ 0, Y(i) = A(i-p); if p < 0, Y(i-|p|) = A(i)
EXTF-B       7B                        4) 5)               if p ≥ 0, Y(i) = B(i-p); if p < 0, Y(i-|p|) = B(i)
EXTF-AB      7C                        6)
EXTF-BA      7D                        7)

[Status columns (S, M, L, Z, V, N, C): update marks not legible.]

Notes: 1. These instructions use the field instruction format (FORMAT 2).
       2. p ≤ i ≤ p + w - 1. "p" stands for position displacement from P0-P5 or from PR0-PR5 and "w" for the width of the bit field
          from W0-W4 or WR0-WR4. Whenever p + w > 32, operation takes place only over the portion of the field up to the end of
          the word. No wraparound occurs.
       3. This instruction uses the aligned format (see Figure 6).
       4. This instruction uses the unaligned field format (see Figure 6).
          p ≥ 0: Case 1
          p < 0: Case 2
       5. If p is positive, the input is LSB aligned and the Y output is aligned at position.
          If p is negative, the input is aligned at |p| and the Y output at LSB.
       6. Firstly, the concatenation of A (High Word) and B (Low Word) is rotated by the amount specified by the position (p). If p is
          positive, left-rotate is performed; if p is negative, right-rotate is performed. Secondly, the least significant bits on the Y output
          specified by the width (w) are extracted.
       7. Same as 6) except that the B input is taken as the high word and the A input as the low word.

Legend: Unsel = Unselected Field
        Sel = Selected Field
        A = A Input
        B = B Input
        Q = Q Register
        * = Updated

Examples:

For all examples, assume STATUS (7:0) is -7 and STATUS (12:8) is 3.

1. 0, PASSF-AL-B, 11, 20

   Pass B to Y and test if B20 to B30
   are all zero. Set Z status if so.

   B: 0 00000000000 00000101011100110100

   Z set to 1 in this case.

2. 3, XORF-A,,

   Exclusive-OR bits A7-A9 with bits
   B0-B2 and output to Y0-Y2. Pass
   B3-B31 to Y3-Y31. Width and
   position values are obtained from
   STATUS (12:0).

   A: bits A9-A7 = 100
   B: 00011100001010001100101001001 001

   A(9-7) ⊕ B(2-0) = Y: 00011100001010001100101001001 101
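
The non-aligned XOR case and the two-word extract of note 6 can be sketched in Python. These are illustrations under stated assumptions: the function names are invented, unselected Y bits of `xorf_a` are taken from B as in the table, and `extf_ab` assumes that Y is taken from the low word of the rotated 64-bit value before the low `w` bits are kept.

```python
MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1


def xorf_a(a: int, b: int, p: int, w: int) -> int:
    """Non-aligned field XOR; bits outside the field pass through from B."""
    field = (1 << w) - 1
    if p >= 0:
        sel = (field << p) & MASK32           # Case 1: A is LSB aligned
        y_sel = (((a & field) << p) ^ b) & sel
    else:
        sel = field                           # Case 2: B is LSB aligned
        y_sel = ((a >> -p) ^ b) & sel
    return (b & ~sel & MASK32) | y_sel


def extf_ab(a: int, b: int, p: int, w: int) -> int:
    """Rotate A:B (A high, B low) by p, then keep the low w bits."""
    word = ((a & MASK32) << 32) | (b & MASK32)
    r = p % 64                                # negative p becomes a right rotate
    rotated = ((word << r) | (word >> (64 - r))) & MASK64
    return rotated & ((1 << w) - 1)
```

Running `xorf_a` with p = -7 and w = 3 on the operands of example 2 reproduces the Y value shown there; `extf_ab` with a negative p can pull out a field that straddles the boundary between the two words.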
TABLE 16. MASK INSTRUCTION

                                          Y Output
Mnemonics    Code   Description           Unsel   Sel
PASS-MASK    7F     Generate Mask         P5      Y(i) = P5

[Complement bars in the Y Output columns and the Status update marks are not legible.]

Notes: 1. This instruction uses the field instruction format (FORMAT 2).
       2. p ≤ i ≤ p + w - 1. "p" stands for the position displacement and "w" for the width of bit field.

Legend:
Unsel = Unselected Field
Sel = Selected Field
A = A Input
B = B Input
Q = Q Register
* = Updated

Example:

0, PASS-MASK, 8, 10    Generates an 8-bit field mask pattern starting from bit position 10.

[Diagram: 32-bit word with the mask field occupying bits 17 to 10; bits 31 to 18 and 9 to 0 lie outside the field.]
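
The PASS-MASK example can be reproduced with a one-line computation. This Python sketch assumes the mask has ones inside the field and zeros outside it by default; the polarity control is exposed as a hypothetical `ones_in_field` parameter rather than modeled on P5, and the function name is likewise an assumption.

```python
def pass_mask(position: int, width: int, ones_in_field: bool = True) -> int:
    """32-bit mask covering bits position .. position+width-1."""
    mask = (((1 << width) - 1) << position) & 0xFFFFFFFF
    return mask if ones_in_field else (~mask & 0xFFFFFFFF)
```

An 8-bit field starting at position 10 covers bits 17 down to 10.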
APPLICATIONS

Suggestions for Power and Ground Pin
Connections

The Am29332 operates in an environment of fast signal rise
times and substantial switching currents. Therefore, care must
be exercised during circuit board design and layout, as with
any high-performance component. The following is a suggested
layout, but since systems vary widely in electrical
configuration, an empirical evaluation of the intended layout is
recommended.

The VCCT and GNDT pins, which carry output driver switching
currents, tend to be electrically noisy. The VCCE and GNDE
pins, which supply the ECL core of the device, tend to produce
less noise, and the circuits they supply may be adversely
affected by noise spikes on the VCCE plane. For this reason, it
is best to provide isolation between the VCCE and VCCT pins,
as well as independent decoupling for each. Isolating the
GNDE and GNDT pins is not required.



Printed Circuit-Board Layout Suggestions 

1. Use of a multi-layer PC board with separate power, ground,
and signal planes is highly recommended.

2. All VCCE and VCCT pins should be connected to the VCC
plane. VCCT pins should be isolated from VCCE pins by means
of a slot cut in the VCC plane; see Figure 7. Physically
separating the VCCE and VCCT pins reduces coupled noise.

3. All GNDE and GNDT pins should be connected directly to
the ground plane.

4. The VCCT pins should be decoupled to ground with a 0.1-µF
ceramic capacitor and a 10-µF electrolytic capacitor, placed
as close to the Am29332 as is practical. VCCE pins should
be decoupled to ground in a similar manner.

A suggested layout is shown in Figure 7.



[Layout drawing CD010471: pin-grid footprint (columns A-U, rows 1-17) showing the isolation cut, the through-hole VCC plane connections, and the decoupling capacitors.]

C3 = C5 = 10 µF or greater (electrolytic or tantalum capacitor)
C4 = C6 = 0.1 µF or greater (ceramic or monolithic capacitor)

Figure 7. Suggested Printed Circuit-Board Layout
Figure 8. Am29332 Thermal Characteristics (Typical)

ABSOLUTE MAXIMUM RATINGS

Storage Temperature .......................... -65 to +150°C
Temperature Under Bias (Tc) .................. -55 to +125°C
Supply Voltage to Ground Potential
  Continuous ................................. -0.5 to +7.0 V
DC Voltage Applied to Outputs
  for HIGH State ............................. -0.5 V to +VCC Max.
DC Input Voltage ............................. -0.5 to +5.5 V

Stresses above those listed under ABSOLUTE MAXIMUM
RATINGS may cause permanent device failure. Functionality
at or above these limits is not implied. Exposure to absolute
maximum ratings for extended periods may affect device
reliability.

OPERATING RANGES

Commercial (C) Devices
  Case Temperature (Tc) ...................... 0 to +85°C
  Supply Voltage ............................. +4.75 V to +5.25 V

Operating ranges define those limits between which the
functionality of the device is guaranteed.

DC CHARACTERISTICS over operating range

Symbol  Description                   Test Conditions (Note 1)          Pins                     Min.    Max.    Units
VOH     Output HIGH Voltage           VCC = 4.75 V, VIN = VIH or VIL,   All Outputs              2.4             Volts
                                      IOH = -1.2 mA
VOL     Output LOW Voltage            VCC = 4.75 V, VIN = VIH or VIL,   All Outputs                      0.5     Volts
                                      IOL = 8 mA
VIH     Input HIGH Level (Guaranteed                                    All Inputs               2.0             Volts
        Logic HIGH Voltage)
VIL     Input LOW Level (Guaranteed                                     All Inputs                       0.8     Volts
        Logic LOW Voltage)
VI      Input Clamp Voltage           VCC = 4.75 V, IIN = -18 mA        All Inputs                       -1.5    Volts
IIL     Input LOW Current             VCC = 5.25 V, VIN = 0.6 V         PY0-3, Y0-31                     -0.55   mA
                                                                        I4-6                             -1.50
                                                                        I7-8                             -1.00
                                                                        SLAVE                            -3.00
                                                                        OE-Y                             -2.50
                                                                        CLK                              -2.00
                                                                        C, Z, V, N, L, PERR              -0.55
                                                                        Other                            -0.50
IIH     Input HIGH Current            VCC = 5.25 V, VIN = 2.4 V         PY0-3, Y0-31                     100     µA
                                                                        I4-6                             150
                                                                        I7-8                             100
                                                                        SLAVE                            300
                                                                        OE-Y                             250
                                                                        CLK                              200
                                                                        C, Z, V, N, L, PERR              100
                                                                        Other                            50
II      Input HIGH Current            VCC = 5.25 V, VIN = 5.5 V         All Inputs                       1.0     mA
IOZH    Off State Output Current      VCC = 5.25 V, VO = 2.4 V          All Outputs                      100     µA
IOZL                                  VCC = 5.25 V, VO = 0.5 V          Except MSERR                     -560
IOS     Output Short-Circuit Current  VCC = 5.75 V, VO = 0.5 V                                   -15     -50     mA
        (Note 2)
ICC     Power Supply Current          VCC = 5.25 V                      Tc = 0 to 85°C                   1800    mA
        (Note 3)                                                        Tc = 85°C                        1690    mA

Notes: 1. For conditions shown as Min. or Max., use the appropriate value specified under Operating Ranges for the applicable device type.

2. Not more than one output should be shorted at a time. Duration of the short circuit test should not exceed one second.

3. Measured with all inputs HIGH and outputs disabled.
SWITCHING CHARACTERISTICS over operating range
A. COMBINATIONAL PROPAGATION DELAYS

                                                   Am29332      Am29332A
No.  From                  To               Max. Delay   Max. Delay    Unit
1    PA0-PA3, PB0-PB3      PERR             19           16            ns
2    DA0-DA31, DB0-DB31    PERR             28           24            ns
3    DA0-DA31, DB0-DB31    PY0-PY3          42           36            ns
4    DA0-DA31, DB0-DB31    Y0-Y31           35           30            ns
5    DA0-DA31, DB0-DB31    C, Z, V, N, L    43           37            ns
6    DA0-DA31, DB0-DB31    MSERR            49           42            ns
7    I0-I8                 PY0-PY3          53           [illegible]   ns
8    I0-I8                 Y0-Y31           47           40            ns
9    I0-I8                 C, Z, V, N, L    48           41            ns
10   I0-I8                 MSERR            55           [illegible]   ns
11   W0-W4                 PY0-PY3          40           34            ns
12   W0-W4                 Y0-Y31           34           29            ns
13   W0-W4                 C, Z, V, N, L    35           30            ns
14   W0-W4                 MSERR            41           35            ns
15   P0-P5                 PY0-PY3          48           41            ns
16   P0-P5                 Y0-Y31           42           36            ns
17   P0-P5                 C, Z, V, N, L    43           37            ns
18   P0-P5                 MSERR            45           39            ns
19   CP                    PY0-PY3          47           40            ns
20   CP                    Y0-Y31           41           35            ns
21   CP                    C, Z, V, N, L    42           [illegible]   ns
22   CP                    STATUS REG.      20           17            ns
23   RS                    C, Z, V, N, L    16           14            ns
24   MCin                  Y0-Y31           31           27            ns
25   MCin                  C, Z, V, N, L    34           29            ns
26   MCin                  MSERR            37           32            ns
27   MLINK                 Y0-Y31           33           28            ns
28   MLINK                 C, Z, V, N, L    37           32            ns
29   MLINK                 MSERR            38           33            ns
30   M/m                   Y0-Y31           33           28            ns
31   M/m                   C, Z, V, N, L    37           33            ns
32   M/m                   MSERR            38           33            ns
33   BOROW                 Y0-Y31           33           28            ns
34   BOROW                 C, Z, V, N, L    37           32            ns
35   BOROW                 MSERR            38           33            ns
36   HOLD                  C, Z, V, N, L    22           19            ns
37   HOLD                  MSERR            29           25            ns
38   PY0-PY3               MSERR            20           17            ns
39   Y0-Y31                MSERR            19           16            ns
40   C, Z, V, N, L         MSERR            21           18            ns
41   PERR                  MSERR            20           17            ns
SWITCHING CHARACTERISTICS (Cont'd.)
B. SETUP AND HOLD TIMES

Parameter (Note 2)          For                      With Respect To
Input Data Setup            DA0-DA31, DB0-DB31       CP ↑
Input Data Hold             DA0-DA31, DB0-DB31       CP ↑
Byte Width Setup            I7-I8                    CP ↑
Byte Width Hold             I7-I8                    CP ↑
Instruction Setup           I0-I6                    CP ↑
Instruction Hold            I0-I6                    CP ↑
Width Setup                 W0-W4                    CP ↑
Width Hold                  W0-W4                    CP ↑
Position Setup              P0-P5                    CP ↑
Position Hold               P0-P5                    CP ↑
Borrow Setup                BOROW                    CP ↑
Borrow Hold                 BOROW                    CP ↑
Macro Carry Setup           MCin                     CP ↑
Macro Carry Hold            MCin                     CP ↑
Macro Link Setup            MLINK                    CP ↑
Macro Link Hold             MLINK                    CP ↑
Macro/Micro Setup           M/m                      CP ↑
Macro/Micro Hold            M/m                      CP ↑
Hold Mode Setup             HOLD                     CP ↑
Hold Mode Hold              HOLD                     CP ↑

[The parameter numbers and the Am29332 and Am29332A value columns of this table are illegible in the source.]

C. MINIMUM CLOCK REQUIREMENTS

                                      Am29332      Am29332A
No.  Description                      Max. Value   Max. Value   Unit
62   Minimum Clock LOW Time           20           20           ns
63   Minimum Clock HIGH Time          20           20           ns


D. ENABLE AND DISABLE TIMES

                                                                         Am29332      Am29332A
No.  From    To                                 Description              Max. Delay   Max. Delay   Unit
64   OE-Y    Y0-Y31, PY0-PY3                    Output Enable Time       25           25           ns
65   OE-Y    Y0-Y31, PY0-PY3                    Output Disable Time      25           25           ns
66   SLAVE   C, Z, V, N, L, PERR                Slave Mode Enable Time   25           25           ns
67   SLAVE   Y0-Y31, PY0-PY3,                   Slave Mode Disable Time  25           25           ns
             C, Z, V, N, L, PERR

Notes: 1. It is the responsibility of the user to maintain a case temperature of 85°C or less. AMD recommends an air velocity of at
least 200 linear feet per minute over the heatsink.
2. See timing diagram for desired mode of operation to determine clock edge to which these setup and hold times apply.
SWITCHING TEST CIRCUITS

[Circuit diagrams TC001102: load circuits for three-state and normal outputs, with load capacitor CL, resistors R1 and R2, and switches S1, S2, S3.]

A. Three-State Outputs

    R1 = (5.0 - VBE - VOL) / (IOL + VOL/1K)

B. Normal Outputs

    R2 = 2.4 V / IOH

    R1 = (5.0 - VBE - VOL) / (IOL + VOL/R2)

Notes: 1. CL = 50 pF includes scope probe, wiring and stray capacitances without device in test fixture.

2. S1, S2, S3 are closed during function tests and all AC tests except output enable tests.

3. S1 and S3 are closed while S2 is open for tPZH test.
S1 and S2 are closed while S3 is open for tPZL test.

4. CL = 5.0 pF for output disable tests.



SWITCHING TEST WAVEFORMS

[Waveform: data input and timing input, with levels 3 V, 1.5 V, and 0 V.]

Setup, Hold, and Release Times

Notes: 1. Diagram shown for HIGH data only. Output transition
may be opposite sense.
2. Cross hatched area is don't care condition.

[Waveform WFR02970: LOW-HIGH-LOW pulse and HIGH-LOW-HIGH pulse.]

Pulse Width

[Waveform: input transition and opposite phase input transition.]

Propagation Delay

[Waveform: control input (3 V, 1.5 V, 0 V), output normally LOW, and output normally HIGH (S2 open), with 0.5 V and 1.5 V reference levels.]

Enable and Disable Times

Notes: 1. Diagram shown for Input Control Enable-LOW and Input Control
Disable-HIGH.
2. S1, S2 and S3 of Load Circuit are closed except where shown.



Test Philosophy and Methods

The following points give the general philosophy that we apply
to tests that must be properly engineered if they are to be
implemented in an automatic environment. The specifics of
which philosophies are applied to which tests are shown.

1. Ensure the part is adequately decoupled at the test head.
Large changes in supply current when the device switches
may cause function failures due to VCC changes.

2. Do not leave inputs floating during any tests, as they may
oscillate at high frequency.

3. Do not attempt to perform threshold tests at high speed.
Following an input transition, ground current may change by
as much as 400 mA in 5-8 ns. Inductance in the ground
cable may allow the ground pin at the device to rise by
hundreds of millivolts momentarily.

4. Use extreme care in defining input levels for AC tests. Many
inputs may be changed at once, so there will be significant
noise at the device pins, and they may not actually reach VIL or
VIH until the noise has settled. AMD recommends using
VIL ≤ 0 V and VIH ≥ 3 V for AC tests.

5. To simplify failure analysis, programs should be designed to
perform DC, Function, and AC tests as three distinct groups
of tests.

6. Capacitive Loading for AC Testing

Automatic testers and their associated hardware have stray
capacitance that varies from one type of tester to another,
but is generally around 50 pF. This makes it
impossible to make direct measurements of parameters
that call for a smaller capacitive load than the associated
stray capacitance. Typical examples of this are the so-called
"float delays," which measure the propagation
delays into and out of the high impedance state and are
usually specified at a load capacitance of 5.0 pF. In these
cases, the test is performed at the higher load capacitance
(typically 50 pF) and engineering correlations based on
data taken with a bench setup are used to predict the
result at the lower capacitance.

Similarly, a product may be specified at more than one
capacitive load. Since the typical automatic tester is not
capable of switching loads in mid-test, it is impossible to
make measurements at both capacitances even though
they may both be greater than the stray capacitance. In
these cases, a measurement is made at one of the two
capacitances. The result at the other capacitance is
predicted from engineering correlations based on data
taken with a bench setup and the knowledge that certain
DC measurements (IOH, IOL, for example) have already
been taken and are within specification. In some cases,
special DC tests are performed in order to facilitate this
correlation.

7. Threshold Testing

The noise associated with automatic testing, the long,
inductive cables, and the high gain of bipolar devices when
in the vicinity of the actual device threshold, frequently give
rise to oscillations when testing high-speed circuits.
These oscillations are not indicative of a reject device, but
instead, of an overtaxed test system. To minimize this
problem, thresholds are tested at least once for each input
pin. Thereafter, "hard" HIGH and LOW levels are used for
other tests. Generally this means that function and AC
testing are performed at "hard" input levels rather than at
VIL Max. and VIH Min.

8. AC Testing

Occasionally, parameters are specified that cannot be
measured directly on automatic testers because of tester
limitations. Data input hold times often fall into this category.
In these cases, the parameter in question is guaranteed
by correlating these tests with other AC tests that have
been performed. These correlations are arrived at by the
cognizant engineer by using data from precise bench
measurements in conjunction with the knowledge that
certain DC parameters have already been measured and
are within specification.

In some cases, certain AC tests are redundant since they
can be shown to be predicted by other tests that have
already been performed. In these cases, the redundant
tests are not performed.
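The ground-bounce figure quoted in item 3 follows from the usual inductor relation. The ground-lead inductance is not given in the text; the 5 nH used below is purely an assumed value for illustration, so the results only indicate the order of magnitude.

```latex
V_{\mathrm{bounce}} = L \, \frac{\Delta I}{\Delta t}
\qquad
\text{with } L = 5\ \mathrm{nH}\ \text{(assumed)}:
\quad
5\ \mathrm{nH} \times \frac{400\ \mathrm{mA}}{5\ \mathrm{ns}} = 0.4\ \mathrm{V},
\qquad
5\ \mathrm{nH} \times \frac{400\ \mathrm{mA}}{8\ \mathrm{ns}} = 0.25\ \mathrm{V}
```

Both values fall in the "hundreds of millivolts" range that the item describes.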
SWITCHING WAVEFORMS
KEY TO SWITCHING WAVEFORMS

[Waveform key. Legible entries: DON'T CARE; ANY CHANGE PERMITTED. WILL BE CHANGING FROM H TO L. WILL BE CHANGING FROM L TO H. CHANGING; STATE UNKNOWN. CENTER LINE IS HIGH IMPEDANCE "OFF" STATE.]

[Timing diagram: setup and hold of the I, W, P, BOROW, and HOLD inputs.]

Setup and Hold Timing

[Timing diagram WF023691: INPUTS* to PERR; C, Z, N, V, L; MSERR; and Status Register.]

Propagation Delays (SLAVE = LOW)

*Inputs: PA0-PA3, PB0-PB3, DA0-DA31, DB0-DB31, I0-I8, W0-W4, P0-P5, CP, RS,
MCin, MLINK, M/m, BOROW, HOLD

SWITCHING WAVEFORMS (Cont'd.)

[Timing diagram WF023700: C, Z, V, N, L and PERR signals.]

Propagation Delay (SLAVE = HIGH)

[Timing diagram: OE-Y to Y0-Y31, PY0-PY3.]

Enable/Disable I (SLAVE = HIGH)

[Timing diagram WF023720: SLAVE to Y0-Y31, PY0-PY3, C, Z, V, N, L, PERR.]

Enable/Disable II (OE-Y = LOW)
INPUT/OUTPUT CIRCUIT DIAGRAM

(All Devices)

[Circuit diagram ICR00480: driving output (IOH, IOL) and driven input (IIL, IIH).]
Am29334

Four-Port Dual-Access Register File

DISTINCTIVE CHARACTERISTICS

• Fast
  With an access time of 24 ns, the Am29334 supports
  80-90 ns microcycle time when used with the Am29300
  Family for 32-bit systems.

• 64 x 18 Bits Wide Register File
  The Am29334 is a high-performance, high-speed, dual-access
  RAM with two READ ports and two WRITE ports.

• Cascadable
  The Am29334 is cascadable to support either wider
  word widths, deeper register files, or both.

• Simplified Timing Control
  Control for write enable timing and for the on-chip read/write
  address multiplexer are derived from a single-phase clock input.

• Byte Parity Storage
  Width of 18 bits facilitates byte parity storage for each
  port and provides consistency with the Am29332 32-bit
  ALU.

• Byte Write Capability
  Individual byte write enables allow byte or full word
  write.

GENERAL DESCRIPTION

The Am29334 is a 64-word deep and 18-bit wide dual-access
register file designed to support other members of
the Am29300 Family by providing high-speed storage. It
has two write and two read ports for data and four 6-bit
address ports. Two address ports are associated with each
pair of read and write data ports, one to read data and the
other to write. The device is capable of performing two
reads and two writes in one cycle. The 18-bit wide register
file allows storage of byte parity to support parity check and
generate in the Am29332 32-bit ALU. Independent control
for each read and write data port allows the Am29334 to be
used as a high-speed shared memory or as a mailbox for a
multiprocessor system. The device is designed with an
access time of 24 ns. It is housed in a 120-lead pin-grid-array
package.
Publication # 05731   Rev. E   Amendment /0
Issue Date: August 1987

RELATED AMD PRODUCTS

Part No.    Description
Am29325     32-Bit Floating Point Processor
Am29331     16-Bit Microprogram Sequencer
Am29332     32-Bit Extended Function ALU



CONNECTION DIAGRAM

[Connection diagram CD010391: 120-lead pin-grid array, columns A through N, rows 1 through 13. Pin assignments are listed in the Table of Interconnections.]

Note: GNDT = TTL GND
      GNDE = ECL GND
      VCCT = TTL VCC
      VCCE = ECL VCC
TABLE OF INTERCONNECTIONS
(Sorted by Pin No.)

PIN   PIN      PAD     PIN   PIN      PAD     PIN   PIN      PAD     PIN   PIN      PAD
NO.   NAME     NO.     NO.   NAME     NO.     NO.   NAME     NO.     NO.   NAME     NO.
-     -        39      C-5   YB5      115     H-2   DA10     10      M-5   YA4      80
-     -        37      C-6   TTL VCC  113     H-3   ECL VCC  68      M-6   YA6      81
-     -        99      C-7   YB10     52      H-11  DB15     34      M-7   YA8      82
-     -        97      C-8   OEB      53      H-12  DB12     95      M-8   YA11     25
A-1   AWA2     1       C-9   YB14     109     H-13  DB14     94      M-9   YA13     86
A-2   ARA3     120     C-10  YB17     48      J-1   DA12     11      M-10  YA15     87
A-3   AWA4     59      C-11  DB1      44      J-2   DA13     71      M-11  ARB3     89
A-4   YB1      58      C-12  DB0      104     J-3   DA11     70      M-12  AWB2     30
A-5   TTL GND  56      C-13  DB7      41      J-11  ECL GND  38      M-13  ARB1     91
A-6   YB7      114     D-1   DA0      4       J-12  ECL GND  38      N-1   WEAL     16
A-7   YB8      54      D-2   ARA0     63      J-13  ECL GND  38      N-2   WEAH     76
A-8   YB12     51      D-3   AWA0     3       K-1   DA16     13      N-3   AWB4     17
A-9   TTL GND  50      D-11  DB4      102     K-2   DA15     72      N-4   YA2      19
A-10  YB15     49      D-12  DB3      43      K-3   DA14     12      N-5   TTL GND  20
A-11  WEBL     47      D-13  DB2      103     K-11  ARB0     92      N-6   YA5      21
A-12  WEBC     106     E-1   DA2      5       K-12  DB17     33      N-7   YA9      24
A-13  AWB5     46      E-2   DA3      65      K-13  DB16     93      N-8   YA10     84
B-1   ARA2     61      E-3   DA1      64      L-1   LEA      14      N-9   TTL GND  26
B-2   AWA3     60      E-11  ECL VCC  98      L-2   ARA5     74      N-10  YA16     28
B-3   ARA4     119     E-12  ECL VCC  98      L-3   DA17     73      N-11  AWB3     29
B-4   YB2      117     E-13  ECL VCC  98      L-4   YA0      18      N-12  ARB2     90
B-5   YB4      116     F-1   DA4      6       L-5   YA3      79      N-13  AWB1     31
B-6   YB6      55      F-2   DA5      66      L-6   OEA      23
B-7   YB9      112     F-3   ECL GND  8       L-7   YA7      22
B-8   YB11     111     F-11  DB8      100     L-8   TTL VCC  83
B-9   YB13     110     F-12  DB5      42      L-9   YA12     85
B-10  YB16     108     F-13  DB6      101     L-10  YA14     27
B-11  WEBH     107     G-1   DA8      9       L-11  YA17     88
B-12  LEB      45      G-2   DA7      67      L-12  AWB0     32
B-13  ARB5     105     G-3   DA6      7       L-13  DB13     35
C-1   AWA1     2       G-11  DB9      40      M-1   WEAC     75
C-2   ARA1     62      G-12  DB11     36      M-2   AWA5     15
C-3   YB0      118     G-13  DB10     96      M-3   ARB4     77
C-4   YB3      57      H-1   DA9      69      M-4   YA1      78

Notes: 1. Pins E-11, E-12 and E-13 are physically shorted together in the package.

2. Pins J-11, J-12 and J-13 are physically shorted together in the package.









TABLE OF INTERCONNECTIONS (Cont'd.)
(Sorted by Pin Name)

PIN      PIN   PAD     PIN      PIN   PAD     PIN      PIN   PAD     PIN      PIN   PAD
NAME     NO.   NO.     NAME     NO.   NO.     NAME     NO.   NO.     NAME     NO.   NO.
-        -     97      DA3      E-2   65      DB16     K-13  93      YA4      M-5   80
-        -     99      DA4      F-1   6       DB17     K-12  33      YA5      N-6   21
-        -     39      DA5      F-2   66      ECL GND  J-12  38      YA6      M-6   81
-        -     37      DA6      G-3   7       ECL GND  F-3   8       YA7      L-7   22
ARA0     D-2   63      DA7      G-2   67      ECL GND  J-11  38      YA8      M-7   82
ARA1     C-2   62      DA8      G-1   9       ECL GND  J-13  38      YA9      N-7   24
ARA2     B-1   61      DA9      H-1   69      ECL VCC  H-3   68      YA10     N-8   84
ARA3     A-2   120     DA10     H-2   10      ECL VCC  E-13  98      YA12     L-9   85
ARA4     B-3   119     DA11     J-3   70      ECL VCC  E-11  98      YA13     M-9   86
ARA5     L-2   74      DA12     J-1   11      ECL VCC  E-12  98      YA14     L-10  27
ARB0     K-11  92      DA13     J-2   71      LEA      L-1   14      YA15     M-10  87
ARB1     M-13  91      DA14     K-3   12      LEB      B-12  45      YA16     N-10  28
ARB2     N-12  90      DA15     K-2   72      OEA      L-6   23      YA17     L-11  88
ARB3     M-11  89      DA16     K-1   13      OEB      C-8   53      YB0      C-3   118
ARB4     M-3   77      DA17     L-3   73      TTL GND  A-5   56      YB1      A-4   58
ARB5     B-13  105     DB0      C-12  104     TTL GND  A-9   50      YB2      B-4   117
AWA0     D-3   3       DB1      C-11  44      TTL GND  N-5   20      YB3      C-4   57
AWA1     C-1   2       DB2      D-13  103     TTL GND  N-9   26      YB4      B-5   116
AWA2     A-1   1       DB3      D-12  43      TTL VCC  C-6   113     YB5      C-5   115
AWA3     B-2   60      DB4      D-11  102     TTL VCC  L-8   83      YB6      B-6   55
AWA4     A-3   59      DB5      F-12  42      WEAC     M-1   75      YB7      A-6   114
AWA5     M-2   15      DB6      F-13  101     WEAH     N-2   76      YB8      A-7   54
AWB0     L-12  32      DB7      C-13  41      WEAL     N-1   16      YB9      B-7   112
AWB1     N-13  31      DB8      F-11  100     WEBC     A-12  106     YB10     C-7   52
AWB2     M-12  30      DB9      G-11  40      WEBH     B-11  107     YB11     B-8   111
AWB3     N-11  29      DB10     G-13  96      WEBL     A-11  47      YB12     A-8   51
AWB4     N-3   17      DB11     G-12  36      YA0      L-4   18      YB13     B-9   110
AWB5     A-13  46      DB12     H-12  95      YA1      M-4   78      YB14     C-9   109
DA0      D-1   4       DB13     L-13  35      YA11     M-8   25      YB15     A-10  49
DA1      E-3   64      DB14     H-13  94      YA2      N-4   19      YB16     B-10  108
DA2      E-1   5       DB15     H-11  34      YA3      L-5   79      YB17     C-10  48
METALLIZATION AND PAD LAYOUT

[Die layout drawing with pad labels.]

Die Size: 258 x 251 mils
Equivalent Gate Count: 3500



ORDERING INFORMATION

Standard Products

AMD standard products are available in several packages and operating ranges. The order number (Valid
Combination) is formed by a combination of:

a. Device Number
b. Speed Option (if applicable)
c. Package Type
d. Temperature Range
e. Optional Processing

a. DEVICE NUMBER/DESCRIPTION
   Am29334
   Four-Port Dual-Access Register File

b. SPEED OPTION
   Not Applicable

c. PACKAGE TYPE
   G = 120-Lead Pin Grid Array with Heatsink
   (CG 120)

d. TEMPERATURE RANGE
   C = Commercial (Tc = 0 to +85°C)

e. OPTIONAL PROCESSING
   Blank = Standard processing
   B = Burn-In

Valid Combinations

   GC, GCB

Valid Combinations list configurations planned to be
supported in volume for this device. Consult the local AMD
sales office to confirm availability of specific valid
combinations, to check on newly released valid combinations,
and to obtain additional data on AMD's standard military
grade products.
PIN DESCRIPTION

ARA0-ARA5 Addresses (Inputs, Active HIGH)

The 6-bit field presented at the ARA inputs selects one of
64 memory words for presentation to the YA Data Latch.

ARB0-ARB5 Addresses (Inputs, Active HIGH)

The 6-bit field presented at the ARB inputs selects one of
64 memory words for presentation to the YB Data Latch.

YA0-YA17 Data Latch (Outputs, Three-State)

The 18-bit YA Data Latch outputs.

YB0-YB17 Data Latch (Outputs, Three-State)

The 18-bit YB Data Latch outputs.

AWA0-AWA5 Addresses (Inputs, Active HIGH)

The 6-bit field presented at the AWA inputs selects one of
64 words for writing new data from the DA inputs.

AWB0-AWB5 Addresses (Inputs, Active HIGH)

The 6-bit field presented at the AWB inputs selects one of
64 words for writing new data from the DB inputs.

DA0-DA17 Data (Inputs, Active HIGH)

New data is written into the word selected by the AWA
address inputs through these inputs.

DB0-DB17 Data (Inputs, Active HIGH)

New data is written into the word selected by the AWB
address inputs through these inputs.

LEA YA Data Latch Enable (Input)

The LEA input controls the latch for the YA output port.
When LEA is HIGH, the latch is open (transparent), and data
from the RAM, as selected by the ARA address inputs, is
present at the YA outputs. When LEA is LOW, the latch is
closed and it retains the last data read from the RAM
selected by the ARA address inputs.

LEB YB Data Latch Enable (Input)

The LEB input controls the latch for the YB output port.
When LEB is HIGH, the latch is open (transparent), and data
from the RAM, as selected by the ARB address inputs, is
present at the YB outputs. When LEB is LOW, the latch is
closed and it retains the last data read from the RAM
selected by the ARB address inputs.

OEA YA Output Enable (Input, Active LOW)

When OEA is LOW, data in the YA Data Latch is present at
the YA outputs. If OEA is HIGH, YA outputs are in the
high-impedance (off) state.

OEB YB Output Enable (Input, Active LOW)

When OEB is LOW, data in the YB Data Latch is present at
the YB outputs. If OEB is HIGH, YB outputs are in the
high-impedance (off) state.

WEAC Write Enable (Input, Active LOW)

When WEAC is LOW together with WEAH and WEAL, new
data is written into the word selected by the AWA address
inputs. When WEAC is HIGH, no data is written into the RAM
through the A port.

WEBC Write Enable (Input, Active LOW)

When WEBC is LOW together with WEBH and WEBL, new
data is written into the word selected by the AWB address
inputs. When WEBC is HIGH, no data is written into the RAM
through the B port.

WEAH High-Byte Write Enable (Input, Active LOW)

When WEAH is LOW together with WEAC, new data is
written into the high byte of the word selected by the AWA
address inputs. When WEAH is HIGH, no data is written into
the high byte of the word selected by the AWA address
inputs.

WEBH High-Byte Write Enable (Input, Active LOW)

When WEBH is LOW together with WEBC, new data is
written into the high byte of the word selected by the AWB
address inputs. When WEBH is HIGH, no data is written into
the high byte of the word selected by the AWB address
inputs.

WEAL Low-Byte Write Enable (Input, Active LOW)

When WEAL is LOW together with WEAC, new data is
written into the low byte of the word selected by the AWA
address inputs. When WEAL is HIGH, no data is written into
the low byte of the word selected by the AWA address
inputs.

WEBL Low-Byte Write Enable (Input, Active LOW)

When WEBL is LOW together with WEBC, new data is
written into the low byte of the word selected by the AWB
address inputs. When WEBL is HIGH, no data is written into
the low byte of the word selected by the AWB address
inputs.
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FUNCTIONAL DESCRIPTION 

The part has two read ports (Yao-Yai7, Ybq-Ybi?), two 
write ports (Dao-Dai7, Dbo-Dbi7), four addresses 
(Arao-Ara5. Awao-AwA5. Arbq-Arbs, Awbq-Awbs), 
two latch enables (LEa, LEg). two output enables (OEa, OEb), 
and six write enables (WEac. WEal, WEah, WEbc, WEbl, 
WEbh) tbat allow writing of data into one or both bytes of a 
word. The separate read and write addresses facilitate cre- 
ation of three- and four-address architectures and allow 
address set-up and RAM access to overlap. 

Since the A and B sides are identical, only operation of the A 
side is described. The address multiplexer provides the RAM 
with the address ARa when WEac = HIGH and with the 
address AWa when WEac = LOW. Internally the part is 
designed so that there is no race condition between the write 
address and the write enable. In most cases WEac and LEa 
will be connected to the clock as shown in Figure 2 so that 
reading will take place in the first part of a clock cycle and 
writing in the last part. The latch at the output of the RAM is 
transparent when LEa = HIGH and retains the data when 
LEa = LOW. The latch has a three-state output Ya controlled 
by OEa. Each word is split into two bytes of 9 bits that can be 
individually written. The low byte covers bits 0 through 8 and 
the high byte covers bits 9 through 17. One or both bytes of 
the data at Da are written into the location given by AWa when 
the common write enable (WEac) and the appropriate byte 
write enables (WEal and WEah) are active. Two special 
cases then arise. First, if a location is written into and read at 
the same time, the value read is the value being written. 
Second, if a location is written into from both the A side and 
the B side, the value written is undefined, but the operation is 
not harmful. 

The transparency mode during a write (WEa = LOW) allows 
the data-in (Da) to not only be written into memory but also to 
appear at the output (Ya) when the output latch (LEa) is HIGH 
and the output enable control (OEa) is LOW. 
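
The behavior described above can be sketched as a small level-sensitive model. This is a minimal illustration only, not AMD's implementation: the class name `PortSide`, the method `step`, and the convention 1 = HIGH, 0 = LOW are assumptions made for the example, and timing is ignored entirely.

```python
# Behavioral sketch of one side (A or B) of the 64 x 18 dual-access RAM.
# All names here are hypothetical; levels are 1 = HIGH, 0 = LOW.

LOW_BYTE_MASK = 0x001FF   # bits 0 through 8
HIGH_BYTE_MASK = 0x3FE00  # bits 9 through 17


class PortSide:
    def __init__(self, ram):
        self.ram = ram    # shared list of 64 eighteen-bit words
        self.latch = 0    # output latch contents

    def step(self, ar, aw, d, we_c, we_l, we_h, le, oe):
        """Apply one set of input levels; return Y, or None when Y is off."""
        # A byte is written only when the common enable and that byte's
        # enable are both LOW (active).
        mask = 0
        if we_c == 0:
            if we_l == 0:
                mask |= LOW_BYTE_MASK
            if we_h == 0:
                mask |= HIGH_BYTE_MASK
        if mask:
            self.ram[aw] = (self.ram[aw] & ~mask) | (d & mask)

        # Address multiplexer: read address when WEc is HIGH,
        # write address when WEc is LOW.
        addr = ar if we_c == 1 else aw

        # Latch is transparent while LE is HIGH, holds while LE is LOW.
        if le == 1:
            self.latch = self.ram[addr]

        # Output is enabled while OE is LOW.
        return self.latch if oe == 0 else None
```

With LE held HIGH and OE held LOW during a write, the model returns the word just written, which corresponds to the transparency mode in the text.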

Extension To Four Read Ports and Two Write 
Ports 

A RAM with four read ports and two write ports can be made 
by using two dual access RAMs and connecting each of the 
write ports, write addresses, and write enables in parallel for 
the two devices. As an example, this RAM may provide data 
storage for a data ALU and an address adder as shown in 
Figure 3. A location should not be read before it has been 
written into for the first time as the contents of the two dual 
access RAMs are likely to be different upon power-up. 
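
The parallel-write arrangement can be illustrated with a short sketch. It assumes two hypothetical objects, one per device, and uses `None` to stand for unknown power-up contents; the class and method names are invented for the example.

```python
# Sketch of a four-read, two-write RAM built from two dual-access devices.
# Every write is applied to both devices; each device serves two read ports.

MASK18 = 0x3FFFF


class DualAccessRam:
    def __init__(self):
        # None models the undefined contents after power-up.
        self.words = [None] * 64

    def write(self, addr, data):
        self.words[addr] = data & MASK18

    def read(self, addr):
        return self.words[addr]


class FourReadTwoWrite:
    def __init__(self):
        self.dev0 = DualAccessRam()
        self.dev1 = DualAccessRam()

    def write(self, addr, data):
        # Write ports, write addresses, and write enables are wired in
        # parallel, so both devices always receive the same write.
        self.dev0.write(addr, data)
        self.dev1.write(addr, data)

    def read(self, port, addr):
        # Ports 0 and 1 come from the first device, 2 and 3 from the second.
        dev = self.dev0 if port < 2 else self.dev1
        return dev.read(addr)
```

Reading a location that has never been written returns `None` here, which mirrors the caution that the two devices may hold different contents until the first write.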

32 Words X 36 Bits Single-Access RAM 

It is possible to convert the 64 words x 18 bits dual-access 
RAM into a 32 word x 36 bit single-access RAM. This is done 
by storing the upper half of the 36 bits in the upper half of the 
64 words and addressing these from the A side. Then store 
the lower half of the 36 bits in the lower half of the 64 words 
and address these from the B side. This arrangement, which is 
shown in Figure 4, does not change the capacity of the RAM, 
but the dual access is lost. 
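
A rough sketch of the address mapping follows. It assumes that the upper half of the array means words 32 through 63 and the lower half words 0 through 31; Figure 4 is not reproduced here, so that split and the function names are assumptions.

```python
# Sketch: using a 64 x 18 array as a 32 x 36 single-access RAM.
# Assumption: the A side addresses words 32-63 (upper 18 bits of each
# 36-bit word) and the B side addresses words 0-31 (lower 18 bits).

MASK18 = 0x3FFFF


def write36(ram, addr, value):
    """Store a 36-bit value at logical address 0..31."""
    ram[32 + addr] = (value >> 18) & MASK18  # upper half, via the A side
    ram[addr] = value & MASK18               # lower half, via the B side


def read36(ram, addr):
    """Reassemble the 36-bit value from both halves."""
    return (ram[32 + addr] << 18) | ram[addr]
```

Both sides are busy on every access, which is why the capacity is unchanged while the dual access is lost.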
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Figure 1. Am29300 Family High-Performance System Block Diagram 
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Figure 2. Read through Ya and Write through Da in a Single Cycle (Two Bytes) 
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Figure 3. RAM with Four Read Ports and Two Write Ports 
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Figure 4. 32x36 RAM (Single Access) Using 64 x 18 Duai-Access RAM 



APPLICATIONS 

Suggestions for Power and Ground Pin 
Connections 

The Am29334 operates in an environment of fast signal rise 
times and substantial switching currents. Therefore, care must 
be exercised during circuit board design and layout, as with 
any high-performance component. The following is a sug- 
gested layout, but since systems vary widely in electrical 
configuration, an empirical evaluation of the intended layout is 
recommended. 

The VccT and GNDT pins, which carry output driver switching 
currents, tend to be electrically noisy. The VccE and GNDE 
pins, which supply the ECL core of the device, tend to produce 
less noise, and the circuits they supply may be adversely 
affected by noise spikes on the VccE plane. For this reason, it 
is best to provide isolation between the VccE and VccT pins, 
as well as independent decoupling for each. Isolating the 
GNDE and GNDT pins is not required. 



Printed Circuit Board Layout Suggestions 

1. Use of a multi-layer PC board with separate power, ground, 
and signal planes is highly recommended. 

2. All VccE and VccT pins should be connected to the Vcc 
plane. VccT pins should be isolated from VccE pins by means 
of a slot cut in the VccE plane; see Figure 5. By physically 
separating the VccE and VccT pins, coupled noise will be 
reduced. 

3. All GNDE and GNDT pins should be connected directly to 
the ground plane. 

4. The VccT pins should be decoupled to ground with a 0.1-µF 
ceramic capacitor and a 10-µF electrolytic capacitor, placed 
as closely to the Am29334 as is practical. VccE pins should 
be decoupled to ground in a similar manner. 

A suggested layout is shown in Figure 5. 
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O = Through Hole 

'' = Vcc Plane Connection 

C1 = C3 = C5 = 1 µF or greater (electrolytic or tantalum capacitor) 
C2 = C4 = C6 = 0.1 µF or greater (ceramic or monolithic capacitor) 



Figure 5. Suggested Printed Circuit Board Layout 
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Parameter 


"cm 


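[Plot not reproduced: thermal resistance θJA (still air, 200 LFM, 600 LFM) 
and θJC (heat sink) versus air velocity in linear feet per minute, with 
horizontal-axis marks at 200, 400, and 600. Legible vertical-axis marks: 15, 5, 3.] 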



Figure 6. Am29334 Thermal Characteristics (Typical) 
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ABSOLUTE MAXIMUM RATINGS 

Storage Temperature                     -65 to +150°C 
Temperature Under Bias - Tc             -55 to +125°C 
Supply Voltage to Ground Potential 
   Continuous                           -0.5 to +7.0 V 
DC Voltage Applied to Outputs 
   for High State                       -0.5 V to +Vcc Max 

Stresses above those listed under ABSOLUTE MAXIMUM 
RATINGS may cause permanent device failure. Functionality 
at or above these limits is not implied. Exposure to absolute 
maximum ratings for extended periods may affect device 
reliability. 

OPERATING RANGES 

Commercial (C) Devices 
   Temperature (Tc)                     0 to +85°C 
   Supply Voltage                       +4.75 to +5.25 V 

Operating ranges define those limits between which the 
functionality of the device is guaranteed. 

DC CHARACTERISTICS over operating range 


Symbol  Description                  Test Conditions (Note 1)                  Min.   Max.   Unit 

VOH     Output HIGH Voltage          Vcc = Min., VIN = VIL or VIH,             2.4           Volts 
                                     IOH = -3 mA 
VOL     Output LOW Voltage           Vcc = Min., VIN = VIL or VIH,                    0.5    Volts 
                                     IOL = 16 mA 
VIH     Input HIGH Level             Guaranteed Input Logical                  2.0           Volts 
                                     HIGH Voltage for All Inputs 
VIL     Input LOW Level              Guaranteed Input Logical                         0.8    Volts 
                                     LOW Voltage for All Inputs 
VI      Input Clamp Voltage          Vcc = Min., IIN = -18 mA                         -1.2   Volts 
IIL     Input LOW Current            Vcc = Max., VIN = 0.5 V                          -0.5   mA 
IIH     Input HIGH Current           Vcc = Max., VIN = 2.4 V                          60     mA 
II      Input HIGH Current           Vcc = Max., VIN = 5.5 V                          1.0    mA 
IOZH    Off-State (High-Impedance)   Vcc = Max., Vo = 2.4 V                           50     µA 
IOZL    Output Current               Vcc = Max., Vo = 0.5 V                           -50 
ISC     Output Short-Circuit         Vcc = Max. + 0.5 V, Vo = 0.6 V            -15    -50    mA 
        Current (Note 2) 
ICC     Power Supply Current         Vcc = Max. 
        (Note 3)                       COM'L Only, Tc = 0 to +85°C                    950    mA 
                                       COM'L Only, Tc = +85°C                         820 
                                       MIL Only, Tc = -55 to +125°C 
                                       MIL Only, Tc = +125°C 


Notes: 1. For conditions shown as Min. or Max., use the appropriate value specified under Operating Ranges for the applicable device 
type. 

2. Not more than one output should be shorted at a time. Duration of the short-circuit test should not exceed one second. 

3. Measured with all inputs HIGH. 

4. Recommended air velocity is 200 linear feet per minute. 
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SWITCHING CHARACTERISTICS over operating range (Note 1) 



No. 



Parameter 



Description 



Test Conditions 



Max. Delay 



Unit 



Access Time 



Ara or Arb to 



LEa or LEb = H 



24 



Turn-On Time 



OEa or OEb i to Ya or Yb 
Active 



20 



Turn-Off Time (Note 2) 



OEa or OEb t to Ya or 
Yb = Higti Impedance 



Cl = 5 pF load 



16 



Enable Time 



LEa or LEb f to Ya or Yb 



Transparency 



WEa or WEb 1 to Ya or Yb 



LEa or LEb = H 



Transparency 



Da or Db to Ya or Yb 



LEa or LEb = H, 
WEa or WEb = L 



16 



32 



33 



Data Setup Time 



Da or Db to WEa or WEb t 



10 



Data Hold Time 



Da or Db to WEa or WEb t 



Address Setup Time 



AwA or Awe to WEa or WEg i 



Address Hold Time 



AwA or AwB to WEa or WEb t 



ns 
ns 



Address Setup Time 



Ara or Arb to LEa or LEb * 



12 



Address Hold Time 



Ara or Arb to LEa or LEb * 



ns 
ns 



13 



Latch Close Before 
Write 



LEa or LEb i to WEa or WEb I 



14 



Write Pulse Width 



WEa or WEb (LOW) 



15 



Latch Data Capture 
Pulse Width 



LEa or LEb (HIGH) 



10 



Notes: 1. WEa = WEac + WEal/H 
WEb = WEbc + WEbl/H 
2. Ya and Yb are tested independently. 
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SWITCHING TEST CIRCUIT 



[Test circuit diagram not reproduced.] 



TC003420 



Three-State Outputs 

Notes: 1. CL = 50 pF includes scope probe, wiring and stray capacitances without device in test fixture. 

2. S1, S2, S3 are closed during function tests and all AC tests except output enable tests. 

3. S1 and S3 are closed while S2 is open for tPZH test. 
S1 and S2 are closed while S3 is open for tPZL test. 

4. CL = 5.0 pF for output disable tests. 



SWITCHING WAVEFORMS 




[Waveform diagram not reproduced; the legible labels refer to input, clock, and output delays.] 
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SWITCHING WAVEFORMS fCont'd.: 



[Waveform diagrams not reproduced. The original figures carry these captions:] 

Read Function (same for B Port) 

Write Function (same for B Port) 

Transparency Function (same for B Port) 
Note: LEa = HIGH 
OEa = LOW 

WF023510 
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INPUT/OUTPUT CIRCUIT DIAGRAM 



DRIVING OUTPUT 




DRIVEN INPUT 
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Am29434 

ECL Four-Port, Dual-Access Register File 



PRELIMINARY 



DISTINCTIVE CHARACTERISTICS 



• Fast 
  With an access time of 20 ns, the Am29434 supports 
  50-60 ns microcycle time when used with the Am29400 
  Family for 32-bit systems. 

• 64 x 18 Bits Wide Register File 
  The Am29434 is a high-performance, high-speed, dual-access 
  RAM with two READ ports and two WRITE ports. 

• Cascadable 
  The Am29434 is cascadable to support either wider 
  word widths, deeper register files, or both. 

• Simplified Timing Control 
  Control for write enable timing and for the on-chip read/write 
  address multiplexer is derived from a single-phase clock input. 

• Byte Parity Storage 
  Width of 18 bits facilitates byte parity storage for each 
  port and provides consistency with the Am29432 32-bit ALU. 

• Byte Write Capability 
  Individual byte-write enables allow byte or full-word write. 






GENERAL DESCRIPTION 



The Am29434 is a 64-word deep and 18-bit wide dual-access 
register file designed to support other members of 
the Am29400 Family by providing high-speed storage. It 
has two write and two read ports for data and four 6-bit 
address ports. Two address ports are associated with each 
pair of read and write data ports, one to read data and the 
other to write. The device is capable of performing two 
reads and two writes in one cycle. The 18-bit wide register 
file allows storage of byte parity to support parity check and 
generate in the Am29432 32-bit ALU. Independent control 
for each read and write data port allows the Am29434 to be 
used as a high-speed shared memory or as a mailbox for a 
multiprocessor system. The device is designed with an 
access time of 20 ns. It is housed in a 120-lead pin grid 
array package. 
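
As an illustration of byte parity storage, the sketch below packs a 16-bit value into an 18-bit word as two 9-bit bytes. The placement of the parity bit as the top bit of each 9-bit byte and the choice of odd parity are assumptions for the example; the data sheet does not specify either, and the function names are hypothetical.

```python
# Sketch: storing two data bytes plus one parity bit each in an 18-bit word.
# Assumed layout: low byte in bits 0-7 with parity in bit 8,
# high byte in bits 9-16 with parity in bit 17.


def parity_bit(byte, odd=True):
    """Return the parity bit for an 8-bit value."""
    ones = bin(byte & 0xFF).count("1")
    even = ones & 1              # bit that makes the total count even
    return even ^ 1 if odd else even


def pack18(data16, odd=True):
    """Build an 18-bit word from a 16-bit value with per-byte parity."""
    lo = data16 & 0xFF
    hi = (data16 >> 8) & 0xFF
    lo9 = (parity_bit(lo, odd) << 8) | lo
    hi9 = (parity_bit(hi, odd) << 8) | hi
    return (hi9 << 9) | lo9


def check18(word18, odd=True):
    """Return True when both 9-bit bytes carry consistent parity."""
    for shift in (0, 9):
        nine = (word18 >> shift) & 0x1FF
        if parity_bit(nine & 0xFF, odd) != (nine >> 8):
            return False
    return True
```

Because each 9-bit byte can be written individually, a byte and its parity bit stay together under a byte write.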
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Publication # Rev. Amendment 

0S554 A /O 
Issue Date: September 1986 









CONNECTION DIAGRAM 














120-Lead PGA* 
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•Pinoul observed from pin side of package. 














TABLE OF INTERCONNECTIONS 














(Sorted by Pin No.) 








PIN NO. 


PIN NAME 


PAD 
NO. 


PIN 
NO. 


PIN NAME 


PAD 
NO. 


PIN 
NO. 


PIN NAME 


PAD 
NO. 


PIN 
NO. 


PIN NAME 


PAD 
NO. 


- 


- 


99 


C-6 


YB5 


115 


H-2 


Daio 


10 


M-5 


Ya4 


80 


- 


- 


97 


C-6 


Vcco 


113 


H-3 


Vcc 


68 


M-6 


Ya6 


81 


- 


- 


39 


C-7 


^^ 


52 


H-11 


Obis 


34 


M-7 


Ya8 


82 


- 


- 


37 


C-8 


53 


H-12 


Dai2 


95 


M-8 


Yaii 


25 


A-1 


AWA2 


1 


C-9 


Yb14 


109 


H-13 


DB14 


94 


M-9 


Yai 3 


86 


A-2 


ArA3 


120 


C-10 


YB17 


48 


J-1 


Dai 2 


11 


M-10 


Yai 5 


87 


A-3 


AWA4 


59 


C-11 


Dbi 


44 


J-Z 


Da13 


71 


M-11 


ArB3 


89 


A-4 


Ybi 


58 


C-12 


Dbg 


104 


J-3 


Daii 


70 


M-12 


AWB2 


30 


A-S 


Vcco 


56 


C-1 3 


Db7 


41 


J-11 


Vee 


38 


M-13 


Arbi 

WfcAL 


91 


A-6 


YB7 


114 


D-1 


Dao 


4 


J-12 


Vee 


38 


N-1 


16 


A-7 


Yb8 


54 


D-2 


Arao 


63 


J-13 


Vee 


38 


N-2 


WEaH 


76 


A-B 


Yb12 


51 


D-3 


AwAO 


3 


K-1 


Da16 


13 


N-3 


AWB4 


17 


A-9 


Vcco 


50 


D-11 


DB4 


102 


K-2 


Dais 


72 


N-4 


Ya2 


19 


A-10 


Yb15 
WEbl 


49 


D-12 


DB3 


43 


K-3 


Dai 4 


12 


N-5 


Vcco 


20 


A-11 


47 


D-13 


Dbz 


103 


K-11 


Arbo 


92 


N-6 


Yas 


21 


A-1 2 


WEbc 


106 


E-1 


Da2 


5 


K-12 


Dbi 7 


33 


N-7 


Ya9 


24 


A-13 


Awes 


46 


E-2 


DA3 


65 


K-13 


Dbi 6 


93 


N-8 


Yaio 


84 


B-1 


ARA2 


61 


E-3 


Dai 


64 


L-1 


LEa 


14 


N-9 


Vcco 


26 


B-2 


AWA3 


60 


E-11 


vcc 


98 


L-2 


Arab 


74 


N-10 


Yai 6 


28 


B-3 


ArA4 


119 


E-12 


Vcc 


98 


L-3 


Dai 7 


73 


N-11 


AWB3 


29 


B-4 


Yb2 


117 


E-13 


Vcc 


98 


L-4 


Yao 


18 


N-12 


ArB2 


90 


B-5 
B-6 


YB4 
YB6 


116 


F-1 


Da4 


6 


L-5 


^A 


79 


N-13 


AwBI 


31 


55 


F-2 


Da6 


66 


L-6 


23 








B-7 


Yb9 


112 


F-3 


Vee 


8 


L-7 


Ya7 


22 








8-S 


Ybii 


111 


F-11 


Db8 


100 


L-8 


Vcco 


83 








B-9 


Ybi 3 


110 


F-12 


DBS 


42 


L-9 


Ya12 


85 








B-10 


Yfi]6 


108 


F-13 


DB6 


101 


L-10 


Yau 


27 








B-11 


webh 


107 


G-1 


Das 


9 


L-11 


Ya17 


88 








B-12 


lEb 


45 


G-2 


Da7 


67 


L-12 


AWBO 


32 








B-13 


ArB6 


105 


G-3 


Da6 


7 


L-13 


Dbi 3 


35 








C-1 


AWAI 


2 


G-11 


DB9 


40 


M-1 


WEac 


75 








C-2 


Arai 


62 


G-12 


Dbii 


36 


M-2 


AWAS 


15 








C-3 


Ybo 


118 


G-13 


Dbio 


96 


M-3 


ArB4 


77 








C-4 


Yb3 


57 


H-1 


Da9 


69 


M-4 


Yai 


78 






i 
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TABLE OF INTERCONNECTIONS 

(Sorted By Pin Name) 



PIN 
NAME 



PIN NO. 



PAD 
NO. 



PIN 
NAME 



PIN NO. 



PAD 
NO. 



PIN 
NAME 



PIN NO. 



PAD 
NO. 



PIN 
NAME 



PIN NO. 



PAD 
NO. 



Arao 
Arai 

AbA2 
ArA3 
ArA4 
ArA5 

Arbo 
Arbi 

ArB2 
ArB3 
ArB4 

Arbs 

AwAO 
AwAI 
AWA2 
AWA3 
AWA4 
AWAS 
AWBO 
AwB1 
AWB2 
AWB3 
AWB4 
AWB5 

Dao 
Dai 

Da2 
Da3 



D-2 

C-2 

N-13 

A-2 

B-3 

L-2 

K-11 

M-13 

N-12 

M-11 

M-3 

B-13 

D-3 

C-1 

A-1 

B-2 

A-3 

M-2 

L-12 

N-13 

M-12 

N-11 

N-3 

A-1 3 

D-1 

E-3 

E-1 

E-2 



99 

97 

39 

37 

63 

62 

61 

120 

119 

74 

92 

91 

90 

89 

77 

105 

3 

2 

1 

60 

59 

15 

32 

31 

30 

29 

17 

46 

4 

64 

5 

65 



Da4 

Das 

Da6 

Da7 
Das 

DA9 

Daio 
Daii 
Dai2 

Da13 

Da14 

Da15 

Da16 

Da17 

Dbo 

Dbi 

DB2 

Db3 

Db4 

DBS 

Db6 

Db7 

DBS 

Db9 

Dbio 

Dbii 

Db12 
Db13 
Db14 

Dbis 

Db16 
Db17 



F-1 

F-2 

G-3 

G-2 

G-1 

H-1 

H-2 

J-3 

J-1 

J-2 

K-3 

K-2 

K-1 

L-2 

C-12 

C-11 

D-1 3 

D-1 2 

D-11 

F-1 2 

F-1 3 

C-1 3 

F-11 

G-11 

G-1 3 

G-1 2 

H-1 2 

L-13 

H-1 3 

H-11 

K-1 3 

K-1 2 



6 
66 
7 
67 
9 
69 
10 
70 
11 
71 
12 
72 
13 
73 

104 
44 

103 
43 

102 
42 

101 
41 

100 
40 
96 
36 
95 
35 
94 
34 
93 
33 



lea 

^A 

5Eb 
vcc 
vcc 



Vcco 
Vcco 

Vcco 

Vcco 

Vcco 
Vcco 
Vee 
Vee 



weac 
weah 
weal 
webc 

WEbh 

webl 

Yao 

Yai 

YA2 
YA3 
YA4 
YA5 
Ya$ 
YA7 



L-1 

B-12 

L-6 

C-8 

H-3 

E-11, 

E-1 2, 

E-1 3 

N-5 

N-9 

A-9 

A-5 

L-8 

C-6 

F-3 

J-11, 

J-1 2, 

J-1 3 

M-1 

N-2 

N-1 

A-1 2 

B-11 

A-11 

L-4 

M-4 

N-4 

L-5 

M-5 

N-6 

M-6 

L-7 



14 
45 
23 
53 
68 



20 
26 
50 
56 
83 
113 
8 
38 



75 
76 
16 
106 
107 
47 
18 
78 
19 
79 
80 
21 
81 
22 



Yas 

Ya9 

Yaio 
Yah 
Yai2 

Ya13 
Ya14 

Yais 
Yai$ 

YA17 

Ybo 
Ybi 

Yb2 
Yb3 
Yb4 

Ybs 

YB6 
Yb7 

Ybs 
Yb9 
Ybio 
Ybii 

Yb12 
Yb13 
Yb14 
Yb15 

Ybis 

Yb17 



M-7 

N-7 

N-8 

M-8 

L-9 

M-9 

L-10 

M-10 

N-10 

L-11 

C-3 

A-4 

B-4 

OA 

B-5 

C-5 

B-6 

A-6 

A-7 

B-7 

C-7 

B-8 

A-8 

B-9 

C-9 

A-10 

B-10 

C-10 



82 
24 
84 
25 
85 
86 
27 
87 
28 
88 

118 
58 

117 
57 

1l6 

115 
55 

114 
54 

112 
52 

111 
51 

110 

109 
49 

108 
48 
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TABLE OF INTERCONNECTIONS 






(Sorted by Pad No.) 




PIN 


PAD 


PIN 


PIN 


PAD 


PIN 


PIN 


PAD 


PIN 


PIN PAD PIN 


NAME 


NUMBER 


NUMBER 


NAME 


NUMBER 


NUMBER 


NAME 


NUMBER 


NUMBER 


NAME NUMBER NUMBER 


AWA2 


1 


A-1 


AWBI 


31 


N-1 3 


ArA2 


61 


B-1 


Arbi 


91 M-1 3 


AWAI 


2 


C-1 


AWBO 


32 


L-1 2 


Arai 


62 


C-2 


Arbo 


92 K-11 


AWAO 


3 


D-3 


Db17 


33 


K-1 2 


Arao 


63 


D-2 


Dbi6 


93 K-13 


Oao 


4 


D-1 


Db15 


34 


H-11 


Dai 


64 


E-3 


Db14 


94 H-1 3 


Da2 


5 


E-1 


Db13 


3S 


L-1 3 


Da3 


65 


E-2 


Dbi2 


95 H-1 2 


Da4 


6 


F-1 


Dbii 


36 


G-12 


Da6 


66 


F-2 


Dbio 


96 G-1 3 


Da6 


7 


G-3 




37 


- 


DA7 


67 


G-2 




97 


Vee 


8 


F-3 


Vee 


38 


J-11, J-12, J-13 


Vcc 


68 


H-3 


Vcc 


98 E-11,E-12,E-13 


Da8 


9 


G-1 




39 


- 


DA9 


69 


H-1 




99 


Daio 


10 


H-2 


Db9 


40 


G-11 


Daii 


70 


J-3 


Db8 


100 F-11 


Dai 2 


11 


J-1 


Db7 


41 


C-1 3 


Dai 3 


71 


J-2 


Db6 


101 F-13 


Da14 


12 


K-3 


DB5 


42 


F-12 


Da15 


72 


K-2 


Db4 


102 D-11 


Dai 6 


13 


K-1 


Db3 


43 


D-12 


Dai 7 


73 


L-3 


Db2 


103 D-13 


LEa 


14 


L-1 


Db1 


44 


C-11 




74 


L-2 


Dbo 


04 C-1 2 


WEal 


IS 


M-2 


LEb 


45 


B-12 


75 


M-1 


ArB5 
WEbc 


05 B-1 3 


16 


N-1 


WEbl 


46 


A-1 3 


WEah 


76 


N-2 


06 A-1 2 


AWB4 


17 


N-3 


47 


A-11 


ArB4 


77 


M-3 


WEbh 


07 B-1 1 


Yao 


18 


L-4 


YB17 


48 


C-10 


Yai 


78 


M-4 


Ybi6 


08 B-10 


Ya2 


19 


N-4 


Yb15 


49 


A-10 


Ya3 


79 


L-5 


Yb14 


09 C-9 


VCCO 


20 


N-5 


Vcco 


SO 


A-9 


YA4 


80 


M-5 


Yb13 


10 B-9 


YA5 


21 


N-6 


Yb12 


51 


A-8 


Ya6 


81 


M-6 


Yb11 


11 B-8 


^; 


22 


L-7 


^l" 


52 


C-7 


Ya8 


82 


M-7 


Yb9 


12 B-7 


23 


L.6 


S3 


M 


Vcco 


83 


L-8 


Vcco 


13 C-6 


Ya9 


24 


N-7 


Yb8 


54 


A-7 


Yaio 


84 


N-8 


Yb7 


14 A-6 


Yah 


2S 


M-8 


YB6 


55 


B-6 


Yaiz 


85 
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Notes: 


1. Vcc is the most positive power supply voltage for internal chip logic. 

2. Vcco is the most positive power supply for output buffers. 

3. Vee is the most negative power supply for all logic. 

4. Pins E-11, E-12, and E-13 are physically shorted together in the package. 

5. Pins J-11, J-12, and J-13 are physically shorted together in the package. 
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ORDERING !NFORMAT!ON 

Standard Products 

AMD standard products are available in several packages and operating ranges. The order number (Valid Combination) is 
formed by a combination of: A. Device Number 

B. Speed Option (if applicable) 

C. Package Type 

D. Temperature Range 

E. Optional Processing 



-E. OPTIONAL PROCESSING 

Blank = Standard processing 
B = Burn-in 



- D. TEMPERATURE RANGE 

C - Commercial (0 to + 70°C) 



-C. PACKAGE TYPE 

G = 120-Pin Pin Grid Array (CG 120*) 



-B. SPEED OPTION 

Not Applicable 



-A. DEVICE NUMBER/DESCRIPTION (Include revision letter) 
Am29434 ECL Four-Port, Dual-Access Register File 



Preliminary. Subject to Change. 



Valid Combinations 



GC, GCB 



Valid Combinations 

Valid Combinations list configurations planned to be 
supported in volume for this device. Consult the local AMD 
sales office to confirm availability of specific valid 
combinations, to check on newly released valid combinations, 
and to obtain additional data on AMD's standard military 
grade products. 
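
The way the fields combine can be sketched as simple concatenation in the order A through E. This is an illustration only: the function name, the assumption that fields are joined without separators, and the treatment of an inapplicable speed option as an empty string are not stated in the ordering text.

```python
# Sketch: assembling an order number from the fields A through E.
# Assumes plain concatenation; confirm real order numbers with AMD.


def order_number(device, package, temperature, processing="", speed=""):
    """Join device, speed, package, temperature, and processing codes."""
    return f"{device}{speed}{package}{temperature}{processing}"
```

For example, `order_number("Am29434", "G", "C", "B")` yields a commercial-range pin grid array part with burn-in under these assumptions.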
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PIN DESCRIPTION 



ARa0-ARa5 Addresses (Inputs, Active HIGH) 

The six-bit field presented at the ARa inputs selects one of 64 
memory words for presentation to the Ya Data Latch. 

ARb0-ARb5 Addresses (Inputs, Active HIGH) 

The six-bit field presented at the ARb inputs selects one of 
64 memory words for presentation to the Yb Data Latch. 

Ya0-Ya17 Data Latch (Outputs) 

The 18-bit Ya Data Latch Outputs. 

Yb0-Yb17 Data Latch (Outputs) 

The 18-bit Yb Data Latch Outputs. 

AWa0-AWa5 Addresses (Inputs, Active HIGH) 

The six-bit field presented at the AWa inputs selects one of 
64 words for writing new data from the Da inputs. 

AWb0-AWb5 Addresses (Inputs, Active HIGH) 

The six-bit field presented at the AWb inputs selects one of 
64 words for writing new data from the Db inputs. 

Da0-Da17 Data (Inputs, Active HIGH) 

New data is written into the word, selected by the AWa 
address inputs, through these inputs. 

Db0-Db17 Data (Inputs, Active HIGH) 

New data is written into the word, selected by the AWb 
address inputs, through these inputs. 

LEa Ya Data Latch Enable (Input) 

The LEa input controls the Latch for the Ya output port. 
When LEa is HIGH, the latch is open (transparent) and data 
from the RAM, as selected by the ARa address inputs, is 
present at the Ya outputs. When LEa is LOW, the Latch is 
closed and it retains the last data read from the RAM 
selected by the ARa address Inputs. 

LEb Yb Data Latch Enable (Input) 

The LEb input controls the Latch for the Yb output port. 
When LEb is HIGH, the Latch is open (transparent) and data 
from the RAM, as selected by the ARb address inputs, is 
present at the Yb outputs. When LEb is LOW, the Latch is 
closed and it retains the last data read from the RAM 
selected by the ARb address inputs. 

OEa Ya Output Enable (Input, Active LOW) 

When OEa is LOW, data in the Ya Data Latch is present at 
the Ya outputs. If OEa is HIGH, Ya outputs are in the LOW 
logic (off) state. 

OEb Yb Output Enable (Input, Active LOW) 

When OEb is LOW, data in the Yb Data Latch is present at 
the Yb outputs. If OEb is HIGH, Yb outputs are in the LOW 
logic (off) state. 



WEac Write Enable (Input, Active LOW) 

When WEac is LOW together with WEah and WEal, new 
data is written into the word selected by the AWa address 
inputs. When WEac is HIGH, no data is written into the RAM 
through the A port. 

WEbc Write Enable (Input, Active LOW) 

When WEbc is LOW together with WEbh and WEbl, new 
data is written into the word selected by the AWb address 
inputs. When WEbc is HIGH, no data is written into the RAM 
through the B port. 

WEah High-Byte Write Enable (Input, Active LOW) 

When WEah is LOW together with WEac, new data is 
written into the high byte of the word selected by the AWa 
address inputs. When WEah is HIGH, no data is written into 
the high byte of the word selected by the AWa address 
inputs. 

WEbh High-Byte Write Enable (Input, Active LOW) 

When WEbh is LOW together with WEbc, new data is 
written into the high byte of the word selected by the AWb 
address inputs. When WEbh is HIGH, no data is written into 
the high byte of the word selected by the AWb address 
inputs. 

WEal Low-Byte Write Enable (Input, Active LOW) 

When WEal is LOW together with WEac, new data is 
written into the low byte of the word selected by the AWa 
address inputs. When WEal is HIGH, no data is written into 
the low byte of the word selected by the AWa address 
inputs. 

WEbl Low-Byte Write Enable (Input, Active LOW) 

When WEbl is LOW together with WEbc, new data is 
written into the low byte of the word selected by the AWb 
address inputs. When WEbl is HIGH, no data is written into 
the low byte of the word selected by the AWb address 
inputs. 

Vcc Internal Logic Ground 

This is the most positive voltage in the internal logic. It is 
used as the reference level for internal logic. 

Vcco Out Drive Ground 

This is the most positive voltage in the output buffer logic. It 
is used as the reference level for the buffer logic. 

Vee Power Supply Voltage 

This is the most negative voltage. It provides power for 
internal and buffer logic. 
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FUNCTIONAL DESCRIPTION 

The part has two read ports (Ya0-Ya17, Yb0-Yb17), two 
write ports (Da0-Da17, Db0-Db17), four addresses 
(ARa0-ARa5, AWa0-AWa5, ARb0-ARb5, AWb0-AWb5), 
two latch enables (LEa, LEb), two output enables (OEa, OEb), 
and six write enables (WEac, WEal, WEah, WEbc, WEbl, 
WEbh) that allow writing of data into one or both bytes of a 
word. The separate read and write addresses facilitate creation 
of three- and four-address architectures and allow 
address set-up and RAM access to overlap. 

Since the A and B sides are identical, only operation of the A 
side is described. The address multiplexer provides the RAM 
with the address ARa when WEac = HIGH and with the 
address AWa when WEac = LOW. Internally the part is 
designed so that there is no race condition between the write 
address and the write enable. In most cases WEac and LEa 
will be connected to the clock as shown in Figure 2 so that 
reading will take place in the first part of a clock cycle and 
writing in the last part. The latch at the output of the RAM is 
transparent when LEa = HIGH and retains the data when 
LEa = LOW. The latch has an output Ya controlled by OEa. 
Each word is split into two bytes of nine bits that can be 
individually written. The low byte covers bits 0 through 8 and 
the high byte covers bits 9 through 17. One or both bytes of 
the data at Da are written into the location given by AWa when 
the common write enable (WEac) and the appropriate byte 
write enables (WEal and WEah) are active. Two special 
cases arise. First, if a location is written into and read at the 
same time, the value read is the value being written. Second, if 
a location is written into from both the A side and the B side, 
the value written is undefined, but the operation is not harmful. 

The transparency mode during a write (WEa = LOW) allows 
the data-in (Da) to not only be written into memory but also to 
appear at the output (Ya) when the output latch (LEa) is HIGH 
and the output enable control (OEa) is LOW. 

Extension To Four Read Ports and Two Write 
Ports 

A RAM with four read ports and two write ports can be made 
by using two dual access RAMs and connecting each of the 
write ports, write addresses, and write enables in parallel for 
the two devices. As an example, this RAM may provide data 
storage for a data ALU and an address adder as shown in 
Figure 3. A location should not be read before it has been 
written into for the first time as the contents of the two dual 
access RAMs are likely to be different upon power-up. 

32 Words x 36 Bits Single-Access RAM 

It is possible to convert the 64 word x 18-bit dual-access RAM 
into a 32 word X 36-bit single-access RAM. This is done by 
storing the upper half of the 36 bits in the upper half of the 64 
words and addressing them from the A side. The lower half of 
the 36 bits should then be stored in the lower half of the 64 
words and addressed from the B side. This arrangement, 
which is shown in Figure 4, does not change the capacity of 
the RAM, but the dual access is lost. 
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Figure 1. Ani29400 Family High-Performance System Block Diagram 
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Figure 2. Read through Ya and Write through Da in a Single Cycle (Two Bytes) 
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Figure 3. RAM with 4 Read Ports and 2 Write Ports 
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Figure 4. 32x36 RAM (Single Access) Using 64 x 18 Dual Access RAIM 



APPLICATIONS 



Suggested Printed Circuit Board Layout 
Bottom View 



[Layout diagram not reproduced: bottom view of the pin grid with the 
Vcco, Vcc, and Vee connections marked.] 

Connect Vcco Directly to Plane. 

AF004151 

Connect Vcc & Vee Directly to Plane from E-13 and J-13. 
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ABSOLUTE MAXIMUM RA 

Storage Temperature                                -65 to +150°C 
Ambient Temperature (remainder of label illegible) -55 to +125°C 
Vee Pin Potential to GND Pin                       -7.0 V to +0.5 V 
Input Voltage (DC)                                 Vcc to +0.5 V 
Output Current (DC Output HIGH)                    -30 mA to +0.1 mA 

Stresses above those listed under ABSOLUTE MAXIMUM 
RATINGS may cause permanent device failure. Functionality 
at or above these limits is not implied. Exposure to absolute 
maximum ratings for extended periods may affect device 
reliability. 

OPERATING RANGES 

Commercial (C) Devices 
   Temperature                                     0 to +75°C 
   Supply Voltage                                  -5.46 V to -4.94 V 
   Air Velocity                                    200 linear feet per minute 

Operating ranges define those limits between which the 
functionality of the device is guaranteed. 

DC CHARACTERISTICS (Commercial) (Notes 1 and 2) 

Symbol  Description          Test Conditions (Note 5)       Ta            Min.      Typ.      Max.      Units 
                                                                          (Note 3)  (Note 1)  (Note 3) 

VOH     Output Voltage HIGH  VIN = VIH Max. or VIL Min.     0°C           -1000               -840      mV 
                                                            +25°C         -960                -810 
                                                            +75°C         -900                -720 
VOL     Output Voltage LOW                                  0°C           -1970               -1665     mV 
                                                            +25°C         -1850               -1650 
                                                            +75°C         -1830               -1625 
VOHC    Output Voltage HIGH  VIN = VIH Min. or VIL Max.     0°C           -1020                         mV 
                                                            +25°C         -980 
                                                            +75°C         -920 
VOLC    Output Voltage LOW                                  0°C                               -1645     mV 
                                                            +25°C                             -1630 
                                                            +75°C                             -1605 
VIH     Input Voltage HIGH   Guaranteed Input Voltage       0°C           -1145               -840      mV 
                             HIGH for All Inputs            +25°C         -1105               -810 
                                                            +75°C         -1045               -720 
VIL     Input Voltage LOW    Guaranteed Input Voltage       0°C           -1870               -1490     mV 
                             LOW for All Inputs             +25°C         -1850               -1475 
                                                            +75°C         -1830               -1450 
IIH     Input Current HIGH   VIN = VIH Max.                 0 to +75°C                        220       µA 
IIL     Input Current LOW    VIN = VIL Min.                 +25°C                             140       µA 
IEE                          All Inputs and Outputs Open    0°C                               950       mA 
                                                            +75°C                             850 


Notes: 1. Typical values are: 

Vee = -5.2 V, Vcc = GND, Vcco = GND 

Output Load = 50 Ω and 30 pF to -2.0 V. 

2. Guaranteed with transverse air flow exceeding 200 linear F.P.M. and 2-minute warm-up period. Typical thermal resistance values of the 
package are: 

θJA (Junction-to-Ambient) = 22°C/Watt (still air) 

θJA (Junction-to-Ambient) = 7.6°C/Watt (at 200 F.P.M. air flow) 

θJC (Junction-to-Case) = 5°C/Watt 

3. These are absolute voltages with respect to device ground pin and include all overshoots due to system and/or tester noise. Do not 
attempt to test these values without suitable equipment 
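
The thermal resistance figures in Note 2 can be combined with the supply current to give a rough junction temperature estimate. The sketch below assumes that dissipation is approximated by the product of the supply current and the magnitude of Vee, and it ignores power delivered to the output loads; the function name and that simplification are assumptions, not AMD's method.

```python
# Sketch: estimating junction temperature from the data sheet figures.
# Tj = Ta + theta_JA * P, with P approximated as |Vee| * Iee.


def junction_temp_c(ambient_c, theta_ja_c_per_w, vee_volts, iee_amps):
    """Return an estimated junction temperature in degrees Celsius."""
    power_w = abs(vee_volts) * iee_amps
    return ambient_c + theta_ja_c_per_w * power_w


# Worst-case current at +75°C (850 mA) with the typical -5.2 V supply.
tj_airflow = junction_temp_c(75.0, 7.6, -5.2, 0.85)   # 200 F.P.M. air flow
tj_still = junction_temp_c(75.0, 22.0, -5.2, 0.85)    # still air
```

Under these assumptions the estimate is roughly 109°C with 200 F.P.M. air flow and roughly 172°C in still air, which illustrates why the specifications are guaranteed only with forced air.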
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SWITCHING CHARACTERISTICS (Commercial Only) 


No. 


Parameters 


From 


To 


Test Conditions 


Time (ns) 


1 


Access Time 


Ara or Arb 


Va or Yb 


LEa or LEb = H 


20 


2 


Turn-On Time 


SEa or OEb = L 


Ya or Yb 




10 


3 


Tum-Off Time 


5Ea or 5Eb = H 


YA0rYB = L 




10 


4 


Enable Time 


LEa or LEb = H 


Ya or Yb 




13 


5 


Transparency 


WEa or WEb = L 


YaOtYb 


LEa or LEb - H 


28 


6 


Transparency 


Da or Db 


YaotYb 


LEa or LEb = H 
WEa- or VvEb - L 


29 


Minimum Setup and Hold Time

No.  Parameters                 For                    With Respect To        Time (ns)
7    Data Setup                 Da or Db               WEa or WEb (L to H)    9
8    Data Hold                  Da or Db               WEa or WEb (L to H)    2
9    Address Setup              AWa or AWb             WEa or WEb (H to L)
10   Address Hold               AWa or AWb             WEa or WEb (L to H)    3
11   Address Setup              ARa or ARb             LEa or LEb (H to L)    7
12   Address Hold               ARa or ARb             LEa or LEb (H to L)    4
13   Latch Close before Write   LEa or LEb (H to L)    WEa or WEb (H to L)



Minimum Pulse Widths

No.  Parameters           Input         Pulse            Time (ns)
14   Write Pulse          WEa or WEb    HIGH-LOW-HIGH    18
15   Latch Data Capture   LEa or LEb    LOW-HIGH-LOW     10

WEa = WEac • (WEal + WEah)
WEb = WEbc • (WEbl + WEbh)
**Ya and Yb are tested independently.
KEY TO SWITCHING WAVEFORMS

[Waveform legend. Inputs: must be steady. Outputs: will be steady; will be changing from H to L; will be changing from L to H; changing, state unknown; center line is high-impedance "off" state.]
Read Function (same for B Port)
SWITCHING WAVEFORMS (Cont'd.)
Write Function (same for B Port)






I/O CURRENT INTERFACE DIAGRAM 
INPUT CIRCUIT 
Am29325

32-Bit Floating-Point Processor 






DISTINCTIVE CHARACTERISTICS 



• Single VLSI device performs high-speed floating-point
arithmetic

- Floating-point addition, subtraction, and multiplication
in a single clock cycle

- Internal architecture supports sum-of-products,
Newton-Raphson division

• 32-bit, three-bus flow-through architecture

- Programmable I/O allows interface to 32- and 16-bit
systems



• IEEE and DEC formats

- Performs conversions between formats

- Performs integer ↔ floating-point conversions

• Six flags indicate operation status

• Register enables eliminate clock skew

• Input and output registers can be made transparent
independently
GENERAL DESCRIPTION



The Am29325 is a high-speed floating-point processor unit. 
It performs 32-bit single-precision floating-point addition, 
subtraction, and multiplication operations in a single VLSI 
circuit, using the format specified by the proposed IEEE 
floating-point standard, P754. The DEC single-precision 
floating-point format is also supported. Operations for
conversion between 32-bit integer format and floating-point 
format are available, as are operations for converting 
between the IEEE and DEC floating-point formats. Any 
operation can be performed in a single clock cycle. Six 
flags — invalid operation, inexact result, zero, not-a-num- 
ber, overflow, and underflow — monitor the status of opera- 
tions. 

The Am29325 has a three-bus, 32-bit architecture, with two 
input buses and one output bus. This configuration provides 



high I/O bandwidth, allows access to all buses and affords 
a high degree of flexibility when connecting this device in a 
system. All buses are registered with each register having a 
clock enable. Input and output registers may be made
transparent independently. Two other I/O configurations, a 
32-bit, two-bus architecture and a 16-bit, three-bus archi- 
tecture, are user-selectable, easing interface with a wide 
variety of systems. Thirty-two-bit internal feedforward data- 
paths support accumulation operations, including sum-of- 
products and Newton-Raphson division. 

Fabricated with the high-speed IMOX™ bipolar process,
the Am29325 is powered by a single 5-volt supply. The 
device is housed in a 145-terminal pin-grid-array package. 
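
The description above mentions Newton-Raphson division built from the multiply and subtract operations. A minimal sketch of that technique follows; it is not the device's microcode. It assumes the reciprocal iteration x(n+1) = x(n) · (2 − b · x(n)), rounds every intermediate result to IEEE single precision with Python's `struct` module to mimic 32-bit arithmetic, and takes the initial seed as a caller-supplied guess. The function names and the iteration count are hypothetical.

```python
import struct


def f32(x):
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def reciprocal(b, seed, iterations=4):
    """Approximate 1/b using only multiply and subtract steps."""
    x = f32(seed)
    for _ in range(iterations):
        # x <- x * (2 - b * x); each step roughly doubles the correct bits.
        x = f32(x * f32(2.0 - f32(b * x)))
    return x


def divide(a, b, seed, iterations=4):
    """a / b computed as a * (1/b)."""
    return f32(a * reciprocal(b, seed, iterations))


print(divide(1.0, 3.0, seed=0.3))
print(divide(10.0, 4.0, seed=0.2))
```

The quality of the seed determines how many iterations are needed; a seed far from 1/b may need more steps, or may fail to converge at all.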
05621 D /O
Am29337



16-Bit Bounds Checker 






DISTINCTIVE CHARACTERISTICS 



• Double Comparator

- Compares a 16-bit input number with a lower limit and
an upper limit

• Cascadable

- 16-bit cascadable to longer words



• Out-of-Bounds Flag

- Flags values that are outside the bounds of a lower
and an upper limit

• Compares Signed or Unsigned Numbers

• 28-Pin Packages






GENERAL DESCRIPTION 



The Am29337 is a 16-bit bounds checker that compares
a 16-bit signed or unsigned number with a lower and an
upper limit stored in the registers. The part flags values that



are out of bounds, or triggers a counter used to count the 
number of values that lie within the given range. 

The Am29337 is cascadable up to 32 bits or greater. 
RELATED AMD PRODUCTS

Part No.     Description
Am2900       Bipolar Bit-Slice Family
Am29C00      CMOS Bit-Slice Family
Am29112      Bipolar 8-Bit Cascadable Microprogram Sequencer
Am29114      Bipolar Interrupt Controller
Am29116      Bipolar 16-Bit Microprogrammable Controller
Am29C116     CMOS 16-Bit Microprogrammable Controller
Am29117      Bipolar 16-Bit Two-Port Microprogrammable Controller
Am29C117     CMOS 16-Bit Two-Port Microprogrammable Controller
Am29C323     CMOS 32 x 32 Multiplier
Am29325      Bipolar 32-Bit Floating Point Processor
Am29C325     CMOS 32-Bit Floating Point Processor
Am29331      Bipolar 16-Bit Microprogram Sequencer
Am29C331     CMOS 16-Bit Microprogram Sequencer
Am29332      Bipolar 32-Bit Non-Cascadable ALU
Am29C332     CMOS 32-Bit Non-Cascadable ALU
Am29334      Bipolar 64x18 Four-Port Dual-Access Register File
Am29C334     CMOS 64x18 Four-Port Dual-Access Register File



CONNECTION DIAGRAM 
Top View 



Pin  Name      Pin  Name
1    D15       28   SIGNED
2    D14       27   D11
3    D13       26   D10
4    D12       25   D9
5    COu       24   D8
6    OOB       23   CP
7    GND       22   Vcc
8    NC        21   ENl
9    COl       20   ENu
10   D0        19   D4
11   D1        18   D5
12   D2        17   D6
13   D3        16   D7
14   CIu       15   CIl

CD010100 



Note: Pin 1 is marked for orientation. 



LOGIC SYMBOL 




LS002810 



METALLIZATION AND PAD LAYOUT 




Die Size: 117x143 
Gate Count: 250 
ORDERING INFORMATION
Standard Products 

AMD standard products are available in several packages and operating ranges. The order number (Valid Combination) is formed by
a combination of: a. Device Number 

b. Speed Option (if applicable) 

c. Package Type

d. Temperature Range 

e. Optional Processing 



AM29337 



B 



-a. DEVICE NUMBER/DESCRIPTION 

Am29337 

16-Bit Bounds Checker 



- e. OPTIONAL PROCESSING 

Blank = Standard processing
B = Burn-in



-d. TEMPERATURE RANGE 

C = Commercial (0 to +70°C)



- c. PACKAGE TYPE 

D = 28-Pin Sidebrazed Ceramic DIP (SD4028)



b. SPEED OPTION 

Not Applicable 



Valid Combinations 



DC, DCB, 



Valid Combinations 

Valid Combinations list configurations planned to be 
supported in volume for this device. Consult the local AMD 
sales office to confirm availability of specific valid 
combinations, to check on newly released combinations, and 
to obtain additional data on AMD's standard military grade 
products. 
ORDERING INFORMATION (Cont'd.)
APL Products 

AMD products for Aerospace and Defense applications are available in several packages and operating ranges. APL (Approved
Products List) products are fully compliant with MIL-STD-883C requirements. The order number (Valid Combination) for APL
products is formed by a combination of: a. Device Number

b. Speed Option (if applicable)

c. Device Class

d. Package Type

e. Lead Finish



/B 



-a. DEVICE NUMBER/DESCRIPTION 

Am29337 

16-Bit Bounds Checker 



-e. LEAD FINISH 

C = Gold 



-d. PACKAGE TYPE (per 09-000) 

X = 28-Pin (400 mil) Sidebrazed Ceramic Dip 
(SD4028) 



-c DEVICE CLASS 

/B = Class B 



b. SPEED OPTION 

Not Applicable 



Valid Combinations 



/BXC 



Valid Combinations 

Valid Combinations list configurations planned to be 
supported in volume for this device. Consult the local AMD
sales office to confirm availability of specific valid 
combinations or to check for newly released valid 
combinations. 



Group A Tests 

Group A tests consist of Subgroups 
1, 2, 3, 7, 8, 9, 10, 11. 
PIN DESCRIPTION



CIl, CIu Carry-In (Inputs) 

Carry input for cascading. 

COl, COu Carry Out (Outputs) 

Carry outputs for the result of comparison. 

CP System Clock (Input) 

Clocks limit registers at the LOW-to-HIGH transition. 

D0-D15 Data Input (Input) 

Input to the comparators and limit registers. 



ENl, ENu Load Enable (Inputs) 

Load enables for the limit registers.

OOB Out-of-Bounds Flag (Output)

Flags values that are out of bounds. Defined as COl · COu.

SIGNED Sign Input (Input) 

Selects signed comparisons when HIGH and unsigned 
comparisons when LOW. 



FUNCTIONAL DESCRIPTION 

The Am29337 is a high-speed bounds checker that determines
if a 16-bit number lies within a lower and an upper limit.
It consists of two comparators and two limit registers, as 
shown in the Block Diagram. 

Limit Registers, Double Comparator 

The Am29337 has a lower limit register and an upper limit 
register. The values of these two registers are loaded from the 
D-bus with the load enable inputs ENl and ENu on the clock's 
rising edge. The values of the data present on the D-bus are 
compared with the values stored in the limit registers through 
the two comparators. The comparators operate on signed 
numbers when SIGNED is HIGH and on unsigned numbers 
when it is LOW. The results of the comparisons are given by 
the outputs COl, COu, and OOB. The definitions of carry
inputs CIl and CIu are given in Table 1, and the combination
of the different regions in Table 2. If the data being compared
is out of the region, the out-of-bounds flag, OOB, which is
defined as COl · COu, is set.
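
The behaviour described above can be sketched as a small software model. This is an illustration of the bounds check only, not of the device's pin-level logic: the function names are hypothetical, and the `inclusive` parameter is an assumption standing in for the carry inputs, whose exact effect on the comparison is defined by Table 1.

```python
def to_signed16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def out_of_bounds(data, lower, upper, signed, inclusive=True):
    """Return True when data lies outside the range [lower, upper]."""
    convert = to_signed16 if signed else (lambda v: v & 0xFFFF)
    d, lo, hi = convert(data), convert(lower), convert(upper)
    if inclusive:
        return not (lo <= d <= hi)
    return not (lo < d < hi)


# The same bit pattern can be in or out of range depending on SIGNED.
print(out_of_bounds(0xFFFF, 0xFFF0, 0x0010, signed=True))
print(out_of_bounds(0xFFFF, 0xFFF0, 0x0010, signed=False))
```

With SIGNED selected, 0xFFFF is read as -1 and falls between -16 and +16; as an unsigned value it is 65535 and the limits 0xFFF0 and 0x0010 no longer bracket it.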



Cascading 

Comparison of numbers longer than 16 bits requires cascading
of two or more bounds-checker slices. Figure 1 shows an
example of this for a 32-bit bounds checker. The comparison
starts from the least significant slice. COl, COu, and OOB of
the most significant slice act as outputs of the overall bounds
checker, while COl and COu of the least significant slice are
connected to CIl and CIu of the most significant slice. CIl and
CIu of the least significant slice act as inputs to the overall
bounds checker. The SIGNED input of the most significant
slice selects whether the value is compared as a signed or an
unsigned number; the SIGNED input of the least significant
slice is tied LOW.

The comparison can also start from the most significant slice. In
this case, COl, COu, and OOB of the least significant slice act as
outputs of the overall bounds checker, while COl and COu of
the most significant slice are connected to CIl and CIu of the
least significant slice.
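
One way to picture the first cascading arrangement (carry rippling from the least significant slice to the most significant slice) is the sketch below. It is a generic model of cascaded comparison rather than the Am29337's actual carry logic: it assumes each slice decides the result when its own 16-bit halves differ and otherwise passes its carry input through. The function names are hypothetical.

```python
def to_signed16(value):
    """Interpret the low 16 bits of value as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def slice_compare(a, b, carry_in, signed):
    """One 16-bit slice: True if a < b, carry_in if a == b, else False."""
    if signed:
        a, b = to_signed16(a), to_signed16(b)
    if a != b:
        return a < b
    return carry_in


def compare32(a, b, signed, carry_in=True):
    """a <= b (carry_in=True) or a < b (carry_in=False) on 32-bit values."""
    # Least significant slice: always unsigned (SIGNED tied LOW).
    low = slice_compare(a & 0xFFFF, b & 0xFFFF, carry_in, signed=False)
    # Most significant slice: carries the sign and receives the low carry.
    return slice_compare((a >> 16) & 0xFFFF, (b >> 16) & 0xFFFF, low, signed)


print(compare32(0x0001FFFF, 0x00020000, signed=False))
print(compare32(0xFFFFFFFF, 0x00000001, signed=True))
```

Only the most significant slice needs to know whether the number is signed, which matches the text's instruction to tie SIGNED LOW on the least significant slice.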
TABLE 1. DEFINITION OF COl AND COu



Inputs 


Outputs 


CIl 


Clu 


COl 


COu 








L<D 


D<U 





1 


L<D 


D<U 


1 





L<D 


D<U 


1 


1 


L<D 


D<U 



Note: 

D = Data Input 
L = Lower Limit
U = Upper Limit



TABLE 2. DIFFERENT COMBINATIONS OF REGIONS 



Inputs 


Outputs 


Description 


CIl 


Clu 


COl 


COu 


OOB 
















Impossible 
Combination 









D<L 








U<D 








L<D<U 





1 








Impossible 
Combination 








D<L 








U<D 








L<D<U 


1 











Impossible 
Combination 








D<L 








U<D 








L<D<U 


1 


1 









Impossible 
Combination 









D<L 









U<D 









L<D<U 



Figure 1. 32-Bit Bounds Checker
ABSOLUTE MAXIMUM RATINGS

Storage Temperature                              -65 to +150°C
Temperature Under Bias (TC)                      -55 to +125°C
Supply Voltage to Ground Potential Continuous    -0.5 to +7.0 V
DC Voltage Applied to Outputs for HIGH State     to VCC Max.
DC Input Voltage
DC Output Current, into Outputs                  30 mA
DC Input Current

Stresses above those listed under ABSOLUTE MAXIMUM
RATINGS may cause permanent device failure. Functionality
at or above these limits is not implied. Exposure to absolute
maximum ratings for extended periods may affect device
reliability.

OPERATING RANGES

Commercial (C) Devices
Temperature (TA)                                 0 to +70°C
Supply Voltage (VCC)                             +4.75 to +5.25 V

Military (M) Devices
Temperature (TC)                                 -55 to +125°C
Supply Voltage (VCC)                             +4.5 to +5.5 V

Operating ranges define those limits between which the
functionality of the device is guaranteed.

Thermal Resistance (Preliminary) - SD4028
θJA = 40°C/W
θJC = 15°C/W

DC CHARACTERISTICS over operating range unless otherwise specified (for APL Products, Group A, 
Subgroups 1, 2, 3 are tested unless otherwise noted) 


Parameter    Parameter
Symbol       Description                      Test Conditions (Note 1)                          Min.    Max.    Units
VOH          Output HIGH Voltage              VCC = Min., VIN = VIL or VIH, IOH = -1.0 mA       2.4             V
VOL          Output LOW Voltage               VCC = Min., VIN = VIL or VIH, IOL = 8.0 mA                0.5     V
VIH          Input HIGH Level                 Guaranteed Input Logical HIGH Voltage             2.0             V
                                              for All Inputs
VIL          Input LOW Level                  Guaranteed Input Logical LOW Voltage                      0.8     V
                                              for All Inputs
VI           Input Clamp Voltage              VCC = Min., IIN = -18 mA                                  -1.2    V
IIL          Input LOW Current                VCC = Max., VIN = 0.5 V                                   -0.5    mA
IIH          Input HIGH Current               VCC = Max., VIN = 2.4 V                                   50      µA
II           Input HIGH Current               VCC = Max., VIN = 5.5 V                                   1       mA
IOZH         F0 - F31 Off State               VCC = Max., VO = 2.4 V                                    25      µA
IOZL         (High Impedance)                 VCC = Max., VO = 0.4 V                                    -25
             Output Current
ISC          Output Short-Circuit             VCC = Max., VO = 0 V                              -15     -50     mA
             Current (Note 2)
ICC          Power Supply Current             VCC = Max., TA = +25°C                                    180     mA
                                              VCC = Max., TA = 0 to +70°C                               230
                                              VCC = Max., TA = +70°C                                    220
                                              VCC = Max., TC = -55 to +125°C                            235
                                              VCC = Max., TC = +125°C                                   215


Notes: 1. For conditions as Min. or Max., use the appropriate value specified under Operating Ranges for the applicable device type. 
2. Not more than one output should be shorted at a time. Duration of the short-circuit test should not exceed one second. 
SWITCHING CHARACTERISTICS over operating range unless otherwise specified (for APL Products,
Subgroups 9, 10, 11 are tested unless otherwise noted)

      Parameter                                               COM'L         MIL
No.   Symbol      Parameter Description                       Max. Delay    Max. Delay    Units
1     tPD         D0-D15 to COl, COu, OOB                     21            23            ns
2     tPC         CIl, CIu to COl, COu, OOB                   13            14            ns
3     tPS         SIGNED to COl, COu, OOB                     18            18            ns
4     tCPO        CP to COl, COu, OOB                         22            24            ns
5     tSD         D0-D15 Setup Time With Regard to CP ↑       12            13            ns
6     tSL         ENl, ENu Setup Time With Regard to CP ↑     12            13            ns
7     tHD         D0-D15 Hold Time
8     tHL         ENl, ENu Hold Time
9     tPWL        Clock Pulse Width LOW                       12            12            ns
10    tPWH        Clock Pulse Width HIGH                      12            12            ns
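
As a rough illustration of how the pulse-width figures bound the clock rate, the sketch below adds the minimum LOW and HIGH widths to get a lower bound on the clock period. It assumes the two 12 ns entries are minimum pulse widths in nanoseconds, and it ignores every other constraint (setup times, propagation delays, system skew), so the result is an upper bound on frequency, not a guaranteed operating point. The function name is hypothetical.

```python
def max_clock_frequency_mhz(t_pwl_ns, t_pwh_ns):
    """Upper bound on clock frequency from minimum LOW and HIGH pulse widths."""
    min_period_ns = t_pwl_ns + t_pwh_ns
    return 1000.0 / min_period_ns


print(f"{max_clock_frequency_mhz(12, 12):.1f} MHz")
```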
SWITCHING TEST CIRCUIT

[Test circuit diagram: Normal Outputs]

R2 = 2.4 V / IOH

R1 = (5.0 - VBE - VOL) / (IOL + VOL / R2)

Notes: 1. CL = 50 pF includes scope probe, wiring, and stray capacitances without device in test fixture.

2. S1 is closed during function tests and all AC tests except output enable tests.

3. CL = 5.0 pF for output disable tests.



SWITCHING WAVEFORMS
KEY TO SWITCHING WAVEFORMS

WAVEFORM    INPUTS                              OUTPUTS
            Must be steady                      Will be steady
            May change from H to L              Will be changing from H to L
            May change from L to H              Will be changing from L to H
            Don't care; any change permitted    Changing; state unknown
            Does not apply                      Center line is high-impedance "off" state
SWITCHING WAVEFORMS (Cont'd.)

[Waveform: D0-D15, CIl-CIu, SIGNED, and COl, COu, OOB]

Propagation Delays from Data Input to Output

[Waveform: CP, D0-D15, ENl-ENu, and COl, COu, OOB]

WF023040

Loading the Limit Registers
INPUT/OUTPUT CIRCUIT DIAGRAM

[Circuit diagram: driving output (IOH, IOL) and driven input]

CI ≈ 5.0 pF, all inputs

CO ≈ 5.0 pF, all outputs

ICR00480
32-Bit Byte Queue



ADVANCE INFORMATION 



DISTINCTIVE CHARACTERISTICS 



Intelligent FIFO Array

- Array of four intelligent FIFO buffers, each 9 bits wide,
32 bits deep (RAM-based)

Queuing/Dequeuing

- Allows variable width queuing/dequeuing in one cycle

Byte Rotation

- Four bytes can be rotated at the input as well as at the
output of the Byte Queue. This allows interfacing
between incompatible byte assignments.



Asynchronous and Synchronous Operation

- Supports communication between systems with different
clocks and different bus widths

Retransmit

- Data can be read out repeatedly

Horizontal Cascading

- Up to four devices allow simultaneous input or output
up to 16 bytes

Parity Check

- Protects data at the input and the output



GENERAL DESCRIPTION 



The Am29338 is an intelligent FIFO that allows up to four 
bytes to be queued and up to four bytes to be dequeued in 
a single cycle. When four devices are cascaded horizontal- 
ly, up to sixteen bytes can be dequeued in a single cycle. 

The Am29338 queues variable-length data by disassem- 
bling the input data, which is aligned on the least-significant 
byte of the input bus (D), into individual bytes. These bytes 
are packed internally in FIFO (first-in, first-out) order. The 
data to be dequeued is unpacked and realigned to the 
least-significant byte of the output bus (Y). Queuing and 
dequeuing can be performed simultaneously. With the 



retransmit capability, the part can repeatedly send the 
block of data stored in the queue without having to requeue 
it. This is useful for retransmitting a block of data upon
receipt of an error in I/O applications or for loop-locking in 
instruction-prefetch applications. 

The queue operates in synchronous or asynchronous 
mode, and is useful as an instruction-prefetch queue or as 
a general-purpose FIFO buffer. 
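
The queuing and dequeuing behaviour described above can be modelled in a few lines. The sketch below is a behavioural illustration, not a description of the internal RAM slices: the class and method names are hypothetical, the 128-byte capacity is taken from the functional description, and the byte order (least-significant byte enters and leaves the queue first) is an assumption based on the LSB alignment the text describes.

```python
from collections import deque


class ByteQueueModel:
    """Behavioural sketch of a byte queue with variable-width access."""

    CAPACITY = 128  # four slices of 32 bytes each

    def __init__(self):
        self.fifo = deque()

    def queue(self, word, nbytes):
        """Queue nbytes (1 to 4) taken from the LSB end of a 32-bit word."""
        if not 1 <= nbytes <= 4:
            raise ValueError("nbytes must be between 1 and 4")
        if len(self.fifo) + nbytes > self.CAPACITY:
            raise OverflowError("queue would overflow")
        for i in range(nbytes):
            self.fifo.append((word >> (8 * i)) & 0xFF)

    def dequeue(self, nbytes):
        """Remove nbytes (1 to 4) and return them aligned to the LSB."""
        if not 1 <= nbytes <= 4:
            raise ValueError("nbytes must be between 1 and 4")
        if nbytes > len(self.fifo):
            raise IndexError("not enough bytes queued")
        word = 0
        for i in range(nbytes):
            word |= self.fifo.popleft() << (8 * i)
        return word


q = ByteQueueModel()
q.queue(0x00CCBBAA, 3)        # queue three bytes: AA, BB, CC
print(hex(q.dequeue(1)))      # one byte comes back first
print(hex(q.dequeue(2)))      # the remaining two, realigned to the LSB
```

Queuing three bytes and then dequeuing one followed by two shows the variable-width repacking: the widths of the writes and reads need not match.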

The device is manufactured in AMD's bipolar IMOX™
technology and comes in a 120-lead pin-grid-array package.
BLOCK DIAGRAM

[Block diagram: byte rotators, parity check, bytes-in-queue logic, and status outputs PDERR, FULL, A-FULL, A-EMPTY, EMPTY, and CNT0-6]
This document contains information on a product under development at Advanced Micro Devices,

Inc. The information is intended to help you to evaluate this product. AMD reserves

the right to change or discontinue work on this proposed product without notice.



Publication # Rev. Amendment 

08815 B /O 
RELATED AMD PRODUCTS

Part No.          Description
Am2900 Family     4-Bit Microprocessor Slice Family
Am29C00 Family    CMOS 4-Bit Microprocessor Slice Family
Am29C101          CMOS 16-Bit Microprocessor Slice
Am29114           Real-Time Interrupt Controller
Am29116           16-Bit Bipolar Microprocessor
Am29116A          High-Speed 16-Bit Bipolar Microprocessor
Am29L116A         Low-Power 16-Bit Bipolar Microprocessor
Am29C116          CMOS 16-Bit Microprocessor
Am29C116-1        CMOS 16-Bit Microprocessor
Am29325           32-Bit Floating Point Processor
Am29C325          CMOS 32-Bit Floating Point Processor
Am29331           16-Bit Microprogram Sequencer
Am29C331          CMOS 16-Bit Microprogram Sequencer
Am29332           32-Bit Extended Function ALU
Am29C332          CMOS 32-Bit Extended Function ALU
Am29334           Four-Port, Dual-Access Register File
Am29C334          CMOS Four-Port, Dual-Access Register File
Am29337           16-Bit Cascadable Bounds Checker
CONNECTION DIAGRAM
Bottom View

[120-lead pin-grid-array, bottom view; pin assignments are listed under Pin Designations.]

CD011040



Legend: GNDE: GND, ECL
GNDT: GND, TTL
VCCE: VCC, ECL
VCCT: VCC, TTL
PIN DESIGNATIONS

(Sorted by Pin Number)

PAD  PIN              PAD  PIN              PAD  PIN              PAD  PIN
NO.  NO.  PIN NAME    NO.  NO.  PIN NAME    NO.  NO.  PIN NAME    NO.  NO.  PIN NAME
1    A1   Y16         115  C5   Y8          40   G11  D7          27   L10  D24
120  A2   PY2         113  C6   GND, TTL    36   G12  D8          88   L11  D22
59   A3   GND, TTL    52   C7   Y4          96   G13  PD1         32   L12  PD2
58   A4   Y12         53   C8   VCC, TTL    69   H1   Y28         35   L13  D10
56   A5   VCC, TTL    109  C9   Y0          10   H2   Y29         75   M1   CNT6
114  A6   Y7          48   C10  PDERR       68   H3   GND, ECL    15   M2   CNT5
54   A7   Y6          44   C11  PD0         34   H11  D12         77   M3   BDQ0
51   A8   Y2          104  C12  POS0        95   H12  D9          78   M4   RESET
50   A9   GND, TTL    41   C13  D5          94   H13  D11         80   M5   BSW1
49   A10  PY0         4    D1   Y21         11   J1   VCC, TTL    81   M6   BQ1
47   A11  VCC, TTL    63   D2   Y20         71   J2   Y31         82   M7   NC
106  A12  FULL        3    D3   Y19         70   J3   Y30         25   M8   D28
46   A13  A-EMPTY     102  D11  D2          38   J11  GND, ECL    86   M9   D25
61   B1   Y17         43   D12  D1          38   J12  GND, ECL    87   M10  PD3
60   B2   Y15         103  D13  D0          38   J13  GND, ECL    89   M11  D20
119  B3   Y14         5    E1   GND, TTL    13   K1   CNT2        30   M12  D19
117  B4   Y11         65   E2   Y23         72   K2   CNT1        91   M13  D16
116  B5   Y9          64   E3   Y22         12   K3   CNT0        16   N1   BDQ3
55   B6   PY1         98   E11  VCC, ECL    92   K11  D15         76   N2   BDQ2
112  B7   Y5          98   E12  VCC, ECL    33   K12  D14         17   N3   BDQ1
111  B8   Y3          98   E13  VCC, ECL    93   K13  D13         19   N4   RXMIT
110  B9   Y1          6    F1   PY3         14   L1   GND, TTL    20   N5   DQCLK
108  B10  PYERR       66   F2   Y24         74   L2   CNT4        21   N6   BSW0
107  B11  A-FULL      8    F3   VCC, ECL    73   L3   CNT3        24   N7   D30
45   B12  POS1        100  F11  D6          18   L4   DQEN        84   N8   D29
105  B13  EMPTY       42   F12  D3          79   L5   QEN         26   N9   D26
2    C1   OE          101  F13  D4          23   L6   QCLK        28   N10  D23
62   C2   Y18         9    G1   Y27         22   L7   BQ0         29   N11  D21
118  C3   Y13         67   G2   Y26         83   L8   D31         90   N12  D18
57   C4   Y10         7    G3   Y25         85   L9   D27         31   N13  D17
PIN DESIGNATIONS

(Sorted by Pin Name)

PAD  PIN              PAD  PIN              PAD  PIN              PAD  PIN
NO.  NO.  PIN NAME    NO.  NO.  PIN NAME    NO.  NO.  PIN NAME    NO.  NO.  PIN NAME
82   M7   NC          34   H11  D12         59   A3   GND, TTL    51   A8   Y2
46   A13  A-EMPTY     93   K13  D13         5    E1   GND, TTL    111  B8   Y3
107  B11  A-FULL      33   K12  D14         50   A9   GND, TTL    52   C7   Y4
77   M3   BDQ0        92   K11  D15         2    C1   OE          112  B7   Y5
17   N3   BDQ1        91   M13  D16         44   C11  PD0         54   A7   Y6
76   N2   BDQ2        31   N13  D17         96   G13  PD1         114  A6   Y7
16   N1   BDQ3        90   N12  D18         32   L12  PD2         115  C5   Y8
22   L7   BQ0         30   M12  D19         87   M10  PD3         116  B5   Y9
81   M6   BQ1         89   M11  D20         48   C10  PDERR       57   C4   Y10
21   N6   BSW0        29   N11  D21         104  C12  POS0        117  B4   Y11
80   M5   BSW1        88   L11  D22         45   B12  POS1        58   A4   Y12
12   K3   CNT0        28   N10  D23         49   A10  PY0         118  C3   Y13
72   K2   CNT1        27   L10  D24         55   B6   PY1         119  B3   Y14
13
LOGIC SYMBOL
Die Size: 270x290 mils
Gate Count: 9000 
ORDERING INFORMATION
Standard Products 



AMD standard products are available in several packages and operating ranges. The order number (Valid Combination) is formed by 
a combination of: a. Device Number 

b. Speed Option (if applicable) 

c. Package Type 

d. Temperature Range 

e. Optional Processing 



AM29338 



B 



-a. DEVICE NUMBER/DESCRIPTION 

Am29338

Byte Queue 



-e. OPTIONAL PROCESSING 

Blank = Standard processing 
B = Burn-in 



-d. TEMPERATURE RANGE 

C = Commercial (0 to +85°C)



-C. PACKAGE TYPE 

G = 120-Lead Pin Grid Array with Heatsink
(CG 120)



b. SPEED OPTION 

Not Applicable 



Valid Combinations 



AM29338 



GC, GCB 



Valid Combinations 

Valid Combinations list configurations planned to be 
supported in volume for this device. Consult the local AMD 
sales office to confirm availability of specific valid 
combinations, to check on newly released combinations, and 
to obtain additional data on AMD's standard military grade 
products. 
PIN DESCRIPTION



A-EMPTY Almost Empty (Output; Active HIGH) 

Indicates that there are less than four bytes of data in the 
queue. It is used in either synchronous or asynchronous 
operation. 

A-FULL Almost Full (Output; Active HIGH) 

Indicates that there are less than four bytes of space 
remaining. It is used in either synchronous or asynchronous 
operation. 

BDQ0-BDQ3 Bytes Dequeued (Input)

Selects the number of bytes to be dequeued (see Table 2).
The byte queue must operate synchronously to be able to
dequeue more than four bytes in a single cycle.

BQ0-BQ1 Bytes Queued (Input)

Selects the number of bytes to be queued (see Table 1).

BSW0-BSW1 Byte Swap (Input)

Allows the bytes on the input to be reordered (see Table 3).

CNT0-CNT6 Byte Count (Output)

Gives the current number of bytes in the queue. These are
used only in synchronous operation.

D0-D31 Data Input (Input) 

Data inputs to be queued. 

DQCLK Dequeue Clock (Input)

Dequeues the number of bytes set up on the Y bus. A LOW-
to-HIGH transition on this input adjusts the internal dequeue
pointers by the number set up on the BDQ lines.



DQEN Dequeue Enable (Input; Active LOW)

While DQEN is LOW, dequeuing is performed normally.
When DQEN is HIGH, DQCLK is disabled.

EMPTY Empty (Output; Active HIGH) 

Indicates that the queue is empty. It is used in either 
synchronous or asynchronous operation. 

FULL Full (Output; Active HIGH) 

Indicates that the queue is full. It is used in either
synchronous or asynchronous operation. 

OE Output Enable (Input; Active LOW)

When OE is LOW, the four bytes following the current
dequeue pointer and the corresponding parity bits are on Y
and PY outputs. When OE is HIGH, Y and PY outputs are
three-stated.

PD0-PD3 Data Input Parity (Input) 

The input parity bits for the corresponding byte on the 
inputs. Only the bytes to be queued and the corresponding 



PD lines are checked for possible parity error. The byte
queue uses even parity.

PDERR Data Input Parity Error (Output; Active
HIGH)

If any of the bytes to be queued have a parity error, PDERR
is asserted.

POS0-POS1 Position (Input)

These inputs are used to program the location of each byte
queue in a horizontally cascaded system upon RESET (see
Table 4).

PY0-PY3 Output Data Parity (Output; Three State)

The output parity bits for Y outputs. When OE is LOW, the
parity bits of the four bytes following the dequeue pointer
appear on these outputs. The byte queue uses even
parity.

PYERR Y Output Parity Error (Output; Active HIGH) 

If any of the bytes on the output has a parity error, PYERR is 
asserted. 

QCLK Queue Clock (Input)

When QCLK is LOW, the number of bytes set up on the BQ
lines is written into the next free space in the queue from
the data set up on the inputs. On a LOW-to-HIGH
transition of this input, the internal queue pointers are
updated. If QEN is HIGH, QCLK has no effect.

QEN Queue Enable (Input; Active LOW)

When QEN is LOW, queuing is performed normally. When
QEN is HIGH, QCLK is disabled.

RESET Reset (Input; Active LOW)

When RESET is LOW, both the internal queue pointer and
the internal dequeue pointer are reset to the first RAM
location and both EMPTY and A-EMPTY are asserted.



RXMIT Retransmit (Input; Active LOW)

When RXMIT is LOW, the internal dequeue pointers are
reset to the first RAM location while the internal queue
pointers remain unchanged. This allows the data contained
between the current queue pointer and the first RAM
location to become available for dequeuing again. The
effect of asserting RXMIT is defined only if 128 bytes or less
have been queued since the last assertion of RESET (see
Figure 5).

Y0-Y31 Data Output (Output; Three State)

The four bytes following the current dequeue pointer appear
on these outputs when OE is LOW. When OE is HIGH, they
are three-stated.
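
The even-parity convention described for the PD and PY pins can be illustrated as follows. This is a sketch of the general even-parity rule (the parity bit makes the total number of ones in the 9-bit group even); the function names are hypothetical, and the check only mirrors the condition under which the text says PDERR or PYERR is asserted.

```python
def even_parity_bit(byte):
    """Parity bit that makes the count of 1s in byte-plus-parity even."""
    return bin(byte & 0xFF).count("1") & 1


def has_parity_error(byte, parity_bit):
    """True when the supplied parity bit does not match even parity."""
    return even_parity_bit(byte) != (parity_bit & 1)


for value in (0x03, 0x07, 0xFF):
    print(f"byte {value:#04x}: parity bit {even_parity_bit(value)}")
```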
FUNCTIONAL DESCRIPTION
Architecture 

The Am29338 is a 32-bit high-performance general-purpose
intelligent FIFO that stores up to 128 bytes in the internal RAM
slices and queues or dequeues up to four bytes in a single
cycle. The byte queue is divided into five functional blocks: 1)
four memory-slice logics, 2) byte rotators for input and output
buses, 3) rotate-enable logic, 4) byte-count logic, and 5) full/
empty-generate logic. Byte-oriented parity checking is
provided on both the D-input bus and the Y-output bus. Figure
1 shows a detailed block diagram of the byte queue.

Memory-Slice Logic 

Figure 2 shows a detail of the memory-slice logic. It consists of 
a 32 X 9 RAM, queue and dequeue pointers, adders for the 
pointers, and a full/empty detector. The RAM has indepen- 
dent 9-bit read and write ports. Both ports are accessible 
simultaneously if different RAM locations are operated on. A 
parity bit is stored along with its corresponding byte into the 
RAM. 

The queue and dequeue pointers point to the next locations
available for queuing and dequeuing, respectively. The next
locations are produced by the internal adders with BQ0-1 or
BDQ0-3 and the current pointer values. When RESET is
asserted, both pointers are set to zero and the RAM is
flushed. These pointers are also used
to indicate that the RAM is either empty or full for each 
memory slice. The slice-empty or slice-full signal is used to 



combinationally form FULL, A-FULL, EMPTY, and A-EMPTY 
signals. 

Byte Rotator 

There are two byte rotators in the byte queue. Each accepts 
36-bit wide data and performs rotation of bytes according to 
the 2-bit rotate values fed from the rotate-enable logic. The 
input byte rotator realigns and stores the bytes to be queued 
into the next free slice location. The output byte rotator 
realigns the bytes to be dequeued to the least significant byte 
of the Y-output bus. 

Rotate-Enable Logic 

The queue and dequeue rotate-enable logic keeps track of
which slice holds the first byte of the next queue/dequeue
operation. A modulo-4 counter is used to rotate the data in
operation and enables the correct slices by the number of
bytes specified by either BQ0-1 or BDQ0-3.

The queue rotate-enable logic also performs byte and/or word
swaps on the incoming data. The input bytes are swapped in
one of four ways, according to Table 3, with BSW0-1 and the
current modulo-4 byte count through the input byte rotator.

Byte-Count Logic 

This logic consists of a queue count register and a dequeue 
count register. The registers are incremented during a queue/ 
dequeue operation by the number of bytes in the operation. 
The combinational subtract logic outside of these registers 
determines the number of bytes stored in the byte queue. 
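
As a rough illustration of the two-register scheme, the sketch below keeps free-running queue and dequeue counts and derives the stored byte count by modular subtraction. The 8-bit register width and all names are assumptions for the example; the data sheet does not state the register widths.

```python
WIDTH = 8                 # assumed register width, wide enough to count 0..128
MASK = (1 << WIDTH) - 1


class ByteCount:
    """Hypothetical model of the byte-count logic."""

    def __init__(self):
        self.q_count = 0   # queue count register
        self.dq_count = 0  # dequeue count register

    def queue(self, n):
        # Incremented by the number of bytes queued; wraps at 2**WIDTH.
        self.q_count = (self.q_count + n) & MASK

    def dequeue(self, n):
        # Incremented by the number of bytes dequeued; wraps at 2**WIDTH.
        self.dq_count = (self.dq_count + n) & MASK

    def stored(self):
        # Combinational subtract: the difference stays correct across wraparound.
        return (self.q_count - self.dq_count) & MASK


bc = ByteCount()
bc.queue(2)
bc.queue(3)
bc.dequeue(1)
bytes_in_queue = bc.stored()
```

Because only the difference matters, neither register ever needs to be reset during normal operation as long as the width covers the maximum occupancy.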
Figure 1. Am29338 Byte Queue Detailed Block Diagram
Figure 2. Memory and Slice Logic
Figure 5. Retransmit Function with the Am29338
Figure 6. Queuing with the Am29338

Notes: 1. Each of the four segments stands for a memory size; MSB = Most-Significant Byte, and
LSB = Least-Significant Byte.

Figure 7. Dequeuing with the Am29338

Notes: 1. Each of the four segments stands for a memory size; MSB = Most-Significant Byte, and
LSB = Least-Significant Byte.
2. First, one byte is dequeued ('A'), followed by a dequeue of two bytes ('CB').

TABLE 1. SELECTING THE NUMBER OF BYTES TO BE QUEUED

TABLE 2. SELECTING THE NUMBER OF BYTES TO BE DEQUEUED
This is possible when four of the byte queues are cascaded together. The byte queue must be operated
synchronously to select more than four bytes for dequeuing.

TABLE 3. ENCODING OF BSW INPUTS
Note: The assumption is made that the 32-bit data "A B C D" appears on the input bus.

TABLE 4. LOCATION IDENTIFICATION FOR HORIZONTAL CASCADING
Note: "0" stands for the least significant chip and "3" the most significant chip.



Operational Modes 

General Operation 

To enter data into the Am29338, the number of bytes to be 
queued is set up on the Bytes Queued (BQ) pins; the 
corresponding data to be queued is set up on the Data Input 
(D) and Data Input Parity (PD) pins, aligned to the least-significant
byte. If Queue Enable (QEN) is asserted, the data is
entered into the Am29338 while the Queue Clock (QCLK) is 
LOW, and the internal queue pointers are updated on the 
LOW-to-HIGH transition of QCLK. 

Figure 6 shows an example of two bytes being queued, 
followed by three bytes being queued. Data is packed in the 
Am29338 so that no holes exist. 

If Output Enable (OE) is asserted, the first four bytes available 
for dequeuing and their corresponding parity appear on the 
Data Output (Y) and Data Parity (PY) pins. The number of 
these bytes to be dequeued is set up on the Bytes Dequeued 
(BDQ) pins. If Dequeue Enable (DQEN) is asserted, the LOW- 
to-HIGH transition of Dequeue Clock (DQCLK) updates the 
internal dequeue pointers, removing the dequeued bytes. 

Figure 7 shows an example of one byte dequeued, followed by 
a dequeue of two bytes. The data to be dequeued next is 
least-significant-byte aligned on the output bus. 
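
The queue and dequeue behavior described above can be modeled at the byte level as follows. This is a behavioral sketch only: the class and method names are hypothetical, clocks, enables, and parity are not modeled, and raising exceptions on overflow or underflow is an assumption (the device signals these conditions with its FULL and EMPTY outputs instead).

```python
from collections import deque


class ByteQueueModel:
    """Hypothetical byte-level model of the Am29338 data path."""

    CAPACITY = 128  # bytes, per the functional description

    def __init__(self):
        self.fifo = deque()

    def queue(self, data):
        """Queue 1 to 4 bytes, least significant byte first (BQ selects the count)."""
        if not 1 <= len(data) <= 4:
            raise ValueError("1 to 4 bytes can be queued per cycle")
        if len(self.fifo) + len(data) > self.CAPACITY:
            raise OverflowError("byte queue is full")
        self.fifo.extend(data)  # bytes are packed, so no holes exist

    def peek(self):
        """Bytes visible on the Y bus: the next (up to) four, LSB-aligned."""
        return list(self.fifo)[:4]

    def dequeue(self, n):
        """Remove n bytes (BDQ selects the count) and return them."""
        if n > len(self.fifo):
            raise IndexError("not enough bytes queued")
        return [self.fifo.popleft() for _ in range(n)]


q = ByteQueueModel()
q.queue(["A", "B"])        # two bytes queued, as in Figure 6
q.queue(["C", "D", "E"])   # followed by three bytes
first = q.dequeue(1)       # one byte dequeued, as in Figure 7
second = q.dequeue(2)      # followed by a dequeue of two bytes
```

In this model the first dequeue returns byte A and the second returns B and C (least significant first), matching the sequence in the note to Figure 7, which writes the pair most-significant byte first as 'CB'.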

Synchronous Mode 

Both synchronous and asynchronous operations are available 
for the byte queue. During synchronous operation, both QCLK 
and DQCLK must be asserted on the edge of a common clock 
within certain skew limits. The following signals can be used 
as valid status outputs for this mode: FULL, A-FULL, EMPTY,
A-EMPTY, and CNT0-6. Refer to the applications section for
an example.



Asynchronous Mode 

During asynchronous operation, the QCLK and DQCLK clocks
may be different. It is possible to execute queue and dequeue
operations simultaneously if different locations are accessed.
In this mode, CNT outputs are not guaranteed as valid and 
horizontal cascading is not possible. Refer to the applications 
section for an example. 

Horizontal Cascading 

In synchronous operation, four byte queues can be horizontally
cascaded. In this case, each of the four byte
queues holds the same data and up to sixteen bytes may be
dequeued in a single cycle, as shown in Table 2 and Figures 3
and 4. Each part must be programmed with its position by the
POS inputs, as shown in Table 4. In normal operation, the
internal dequeue pointer of each part is displaced according to
the POS inputs. When RESET or RXMIT is asserted, the
dequeue pointers are offset by the value programmed on the
POS inputs.

Horizontal cascading is useful in instruction buffers designed 
for systems with large, variable instructions that can span 
many bytes. 
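
A minimal sketch of the cascading idea follows. It assumes that the part at position POS = p presents the four bytes starting 4 x p bytes past the common dequeue point; the exact pointer offset used by the device is given in Table 4 and is not reproduced here, so the offset, the function names, and the list-based data stream are all assumptions.

```python
def part_window(stream, head, pos):
    """Four bytes presented by the part at position `pos` (0 = least significant chip).

    Every part holds the same data; only its dequeue pointer differs
    (assumed displacement: 4 * pos bytes).
    """
    start = head + 4 * pos
    return stream[start:start + 4]


def cascaded_dequeue(stream, head, n):
    """Dequeue up to sixteen bytes in one cycle from four cascaded parts."""
    if not 0 <= n <= 16:
        raise ValueError("at most sixteen bytes per cycle")
    window = []
    for pos in range(4):
        window.extend(part_window(stream, head, pos))
    # All four parts advance their pointers by the same byte count.
    return window[:n], head + n


stream = list(range(40))  # stand-in for queued instruction bytes
instr, head = cascaded_dequeue(stream, 0, 11)   # an 11-byte instruction
nxt, head = cascaded_dequeue(stream, head, 16)  # a 16-byte instruction
```

Because every part advances by the full instruction length, the four 4-byte windows stay contiguous from one cycle to the next.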

APPLICATIONS 

Using Am29338 as an Instruction-Prefetch 
Queue 

Figure 8 shows the Am29338 used as an instruction-prefetch 
queue. Sequential 32-bit memory locations are fetched by the 
Instruction Fetch Unit (IFU) and are queued up in the byte 
queue. When the central processor needs the next instruction, 
it looks at the next four bytes from the byte queue. The central 
processor then determines the instruction length from the 
opcode and updates the dequeue pointer in the byte queue by 
setting up the instruction length on the BDQ lines and 
asserting DQCLK. When a jump occurs, the IFU flushes the 
queue by asserting the RESET input and begins from the new
address. For this application, the byte queue must be in 
synchronous mode. 



Using the RXMIT input, the byte queue can resend a block of
data by dequeuing it again rather than having it requeued. This
is useful for locking loops into the byte queue and allows
the processor to run faster than if it had to refetch instructions
from memory or cache. Figure 9 illustrates how a loop can
execute directly out of the byte queue.

Using Am29338 as a Hardware Mailbox in 
Multiprocessing System 

A mailbox is a communication device between loosely coupled 
processes in a multi-programming system. Messages from 
one process to another are queued in the mailbox on a first-in, 
first-out (FIFO) basis. In a multiprocessing system, hardware 
mailboxes are required. This can be implemented using the 
Am29338 as shown in Figure 10. 

When a process wishes to send a message to the mailbox, it
calls a special operating-system routine. This routine first
reads the status of the mailbox; if it is not FULL, the routine
writes the message to the mailbox and returns to the
calling process. If the mailbox is FULL, the operating system
blocks the calling process on a special queue and enables
interrupts from the mailbox. When a slot becomes available in
the mailbox, the sending processor is interrupted. The interrupt
routine sends the message to the mailbox, disables
interrupts from the mailbox, and unblocks the blocked process.
On the receiving side, the EMPTY status of the mailbox
must be available to the receiving processor in order to allow
the receiving process to be blocked if the mailbox is empty.
When a mailbox slot becomes filled, a blocked process must
be awakened by interrupting the receiving processor.
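
The sending-side protocol can be sketched as a small single-threaded simulation. Everything here is illustrative: the `Mailbox` class, its method names, and the slot count are hypothetical, the interrupt is modeled as a direct method call, and keeping interrupts enabled while further senders remain blocked is an assumption that goes beyond the text.

```python
from collections import deque


class Mailbox:
    """Hypothetical software model of the hardware mailbox protocol."""

    def __init__(self, slots):
        self.slots = slots
        self.messages = deque()
        self.blocked_senders = deque()  # (process, message) pairs
        self.interrupts_enabled = False
        self.unblocked = []             # processes released by the interrupt routine

    def full(self):
        return len(self.messages) >= self.slots

    def empty(self):
        return not self.messages

    def send(self, process, message):
        """Operating-system send routine: True if sent, False if the caller blocks."""
        if not self.full():
            self.messages.append(message)
            return True
        self.blocked_senders.append((process, message))
        self.interrupts_enabled = True
        return False

    def receive(self):
        """Receiver side: None stands for 'block the receiving process'."""
        if self.empty():
            return None
        message = self.messages.popleft()
        if self.interrupts_enabled and self.blocked_senders:
            self._slot_available_interrupt()
        return message

    def _slot_available_interrupt(self):
        # Interrupt routine: send the pending message, unblock the sender,
        # and disable interrupts once nobody is left waiting (assumption).
        process, message = self.blocked_senders.popleft()
        self.messages.append(message)
        self.unblocked.append(process)
        if not self.blocked_senders:
            self.interrupts_enabled = False


mb = Mailbox(slots=2)
mb.send("p1", "m1")
mb.send("p2", "m2")
blocked = not mb.send("p3", "m3")  # mailbox is FULL, so p3 is blocked
got = mb.receive()                 # frees a slot and triggers the interrupt
```

After the receive, message m3 has been written by the interrupt routine and process p3 is unblocked, so FIFO order m1, m2, m3 is preserved.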

The mailbox can be extended to operate in a heterogeneous 
multiprocessing system. In this type of system, processors 
with varying data-path widths and clock frequencies are 
interconnected. For example, a 32-bit main processor may 
control 8- to 16-bit coprocessors. The ability of the Am29338
to match data-path widths and to queue and dequeue
asynchronously allows processors of different widths and clock
rates to communicate.

Figure 8. Instruction-Prefetch Queue
Figure 10. Implementation of a Hardware Mailbox



Suggestions for Power and Ground Pin 
Connections 

The Am29338 operates in an environment of fast signal rise
times and substantial switching currents. Therefore, care must
be exercised during circuit board design and layout, as with
any high-performance component. The following is a suggested
layout, but since systems vary widely in electrical
configuration, an empirical evaluation of the intended layout is
recommended.

The VccT and GNDT pins, which carry output driver switching
currents, tend to be electrically noisy. The VccE and GNDE
pins, which supply the ECL core of the device, tend to produce
less noise, and the circuits they supply may be adversely
affected by noise spikes on the VccE plane. For this reason, it
is best to provide isolation between the VccE and VccT pins,
as well as independent decoupling for each. Isolating the
GNDE and GNDT pins is not required.



Printed Circuit-Board Layout Suggestions 

1. Use of a multi-layer PC board with separate power, ground,
and signal planes is highly recommended.

2. All VccE and VccT pins should be connected to the Vcc
plane. VccT pins should be isolated from VccE pins by means
of a slot cut in the VccE plane; see Figure 11. By physically
separating the VccE and VccT pins, coupled noise will be
reduced.

3. All GNDE and GNDT pins should be connected directly to
the ground plane.

4. The VccT pins should be decoupled to ground with a 0.1-µF
ceramic capacitor and a 10-µF electrolytic capacitor, placed
as closely to the Am29338 as is practical. VccE pins should
be decoupled to ground in a similar manner.

A suggested layout is shown in Figure 11.

Figure 11. Suggested Printed Circuit-Board Layout
ABSOLUTE MAXIMUM RATINGS

Storage Temperature ............................ -65 to +150°C
Case Temperature with Power Applied ............ -55 to +125°C
Supply Voltage with Respect to Ground .......... -0.5 to +7.0 V
DC Voltage Applied to Outputs for HIGH State ... -0.5 V to +Vcc Max.
DC Input Voltage ............................... -0.5 V to +5.5 V

Stresses above those listed under ABSOLUTE MAXIMUM 
RATINGS may cause permanent device failure. Functionality 
at or above these limits is not implied. Exposure to absolute 
maximum ratings for extended periods may affect device 
reliability. 



OPERATING RANGES

Commercial (C) Devices

Case Temperature (Tc) .......................... 0 to +85°C
Supply Voltage (Vcc) ........................... +4.75 to +5.25 V
θJA (under 200 lfm)

Operating ranges define those limits between which the 
functionality of the device is guaranteed. 



DC CHARACTERISTICS over operating ranges unless otherwise specified

Parameter  Parameter                     Test Conditions                      Typ.
Symbol     Description                   (Note 1)                      Min.   (Note 2)  Max.  Unit

VOH        Output HIGH Voltage           Vcc = Min.,
                                         VIN = VIL or VIH,
                                         IOH = -3 mA
VOL        Output LOW Voltage            Vcc = Min.,
                                         VIN = VIL or VIH,
                                         IOL = 16 mA
VIH        Input HIGH Level              Guaranteed Input Logical      2.0
                                         HIGH Voltage for All Inputs
VIL        Input LOW Level               Guaranteed Input Logical
                                         LOW Voltage for All Inputs
           Input Clamp Voltage
IIL        Input LOW Current             QCLK, DQCLK
                                         Others
IIH        Input HIGH Current            Vcc = Max.,                                          µA
                                         VIN = 2.4 V
           Input HIGH Current            Vcc = Max.,
                                         VIN = 5.5 V
IOZH                                     Vcc = Max., VO = 2.4 V                               mA
IOZL                                     Vcc = Max., VO = 0.6 V
ISC        Output Short-Circuit Current  Vcc = Max. + 0.5 V,
                                         VO = 0.5 V
ICC        Power Supply Current          Vcc = Max.,
                                         All Inputs HIGH:
                                         Tc = 0 to +85°C
                                         Tc = +85°C                                     800

Notes: 1. For conditions shown as Min. or Max., use the appropriate value specified under Operating Ranges for the applicable device
type.

2. Typical values are for Vcc = 5.0 V, +25°C ambient, and maximum loading.

3. Not more than one output should be shorted at a time. Duration of the short-circuit test should not exceed one second.
SWITCHING CHARACTERISTICS over operating range (Note 1)

A. Combinational Propagation Delays

No.  From      To                   Delay  Unit
1    D         PDERR                50     ns
2    PD        PDERR                50     ns
3    DQCLK ↑   A-EMPTY or A-FULL    44     ns
4    DQCLK ↑   CNT                  46     ns
5    DQCLK ↑   EMPTY or FULL        44     ns
6    DQCLK ↑   PYERR                60     ns
7    DQCLK ↑   Y                    52     ns
8    OE        PYERR                       ns
9    OE        Y                           ns
10   QCLK ↑    A-EMPTY or EMPTY     44     ns
11   QCLK ↑    CNT                         ns
12   QCLK ↑    A-FULL or FULL       44     ns
13   RESET ↓   A-FULL or FULL              ns
14   RESET ↓   CNT                  46     ns
15   RESET ↓   EMPTY or A-EMPTY     44     ns
16   RESET ↓   PYERR                60     ns
17   RESET ↓   Y                    52     ns
18   RXMIT ↓   A-FULL or FULL       44     ns
19   RXMIT ↓   CNT                  46     ns
20   RXMIT ↓   A-EMPTY or EMPTY     44     ns
21   RXMIT ↓   PYERR                60     ns
22   RXMIT ↓   Y                    52     ns
B. Setup and Hold Times

No.  Parameter              For    With Respect To  Delay  Unit
23   Bytes Dequeued Setup   BDQ    DQCLK ↑          20     ns
24   Bytes Dequeued Hold    BDQ    DQCLK ↑                 ns
25   Bytes Queued Setup     BQ     QCLK ↓           12     ns
26   Bytes Queued Hold      BQ     QCLK ↑                  ns
27   Byte Swap Setup        BSW    QCLK ↑           20     ns
28   Byte Swap Hold         BSW    QCLK ↓                  ns
29   Data Setup             D      QCLK ↑           8      ns
30   Data Hold              D      QCLK ↑                  ns
31   Data Parity Setup      PD     QCLK ↑           8      ns
32   Data Parity Hold       PD     QCLK ↑                  ns
33   Dequeue Enable Setup   DQEN   DQCLK ↑          8      ns
34   Dequeue Enable Hold    DQEN   DQCLK ↑                 ns
35   Queue Enable Setup     QEN    QCLK ↓                  ns
36   Queue Enable Hold      QEN    QCLK ↑                  ns
C. Minimum Clock Requirements

No.  Input   Description                     Delay  Unit
37   DQCLK   Dequeue Min. Pulse Width LOW    10     ns
38           Dequeue Min. Pulse Width HIGH   10     ns
39           Dequeue Min. Cycle Time         80     ns
40   QCLK    Queue Min. Pulse Width LOW      10     ns
41           Queue Min. Pulse Width HIGH     10     ns
42           Queue Min. Cycle Time           80     ns

Notes: 1. Case temperature (Tc) = 0 to +85°C, supply voltage (Vcc) = 5 V ±5%. It is the responsibility of the user to maintain a case
temperature of +85°C or less. AMD recommends an air velocity of at least 200 linear feet per minute over the heatsink.
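
The minimum cycle times bound the peak transfer rate. The short calculation below is a back-of-the-envelope sketch that takes the 80-ns minimum queue and dequeue cycle time from Table C, four bytes per cycle for a single part, and sixteen bytes per cycle for a four-part cascaded dequeue; the variable and function names are illustrative, and sustained rates in a real system will be lower.

```python
MIN_CYCLE_NS = 80  # minimum queue/dequeue cycle time from Table C

# Maximum clock frequency in MHz: 1000 ns per microsecond / cycle time in ns.
max_clock_mhz = 1000 / MIN_CYCLE_NS


def peak_mbytes_per_second(bytes_per_cycle):
    """Peak rate in MB/s (1 MB = 10**6 bytes) at the minimum cycle time."""
    return bytes_per_cycle * 1000 / MIN_CYCLE_NS


single_part = peak_mbytes_per_second(4)        # one Am29338, four bytes per cycle
cascaded_dequeue = peak_mbytes_per_second(16)  # four parts cascaded, dequeue side
```

Under these assumptions the clock limit is 12.5 MHz, giving a peak of 50 MB/s for a single part and 200 MB/s on the dequeue side of a four-part cascade.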
SWITCHING TEST CIRCUITS

A. Three-State Outputs

R2 = 1K
R1 = (5.0 - VBE - VOL) / (IOL + VOL / 1K)

B. Normal Outputs

R2 = 2.4 V / IOH
R1 = (5.0 - VBE - VOL) / (IOL + VOL / R2)

Notes: 1. CL = 50 pF includes scope probe, wiring and stray capacitances without device in test fixture.

2. S1, S2, S3 are closed during function tests and all AC tests except output enable tests.

3. S1 and S3 are closed while S2 is open for tPZH test.
S1 and S2 are closed while S3 is open for tPZL test.

4. CL = 5.0 pF for output disable tests.
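
As a worked example of the load-resistor relations for the test circuits, R1 = (5.0 - VBE - VOL) / (IOL + VOL / R2) with R2 = 1K for three-state outputs and R2 = 2.4 V / IOH for normal outputs, the sketch below evaluates them numerically. IOL = 16 mA and IOH = 3 mA are the magnitudes from the DC test conditions; VBE = 0.7 V and VOL = 0.5 V are assumed values chosen only for illustration, and the function names are hypothetical.

```python
def r2_normal(ioh_amps):
    """R2 for normal outputs: 2.4 V / IOH (IOH given as a magnitude in amperes)."""
    return 2.4 / ioh_amps


def r1(vbe, vol, iol_amps, r2_ohms):
    """R1 = (5.0 - VBE - VOL) / (IOL + VOL / R2), in ohms."""
    return (5.0 - vbe - vol) / (iol_amps + vol / r2_ohms)


VBE = 0.7     # assumed diode drop, volts
VOL = 0.5     # assumed output LOW voltage, volts
IOL = 0.016   # 16 mA, from the VOL test condition
IOH = 0.003   # 3 mA magnitude, from the VOH test condition

r1_three_state = r1(VBE, VOL, IOL, 1000.0)  # three-state outputs: R2 = 1K
r2_norm = r2_normal(IOH)                    # normal outputs
r1_normal = r1(VBE, VOL, IOL, r2_norm)
```

With these assumed values R2 for normal outputs is 800 ohms, and R1 comes out near 230 ohms in both circuits, since the VOL / R2 term is small compared with IOL.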
Test Philosophy and Methods

The following points give the general philosophy that we apply 
to tests that must be properly engineered if they are to be 
implemented in an automatic environment. The specifics of 
what philosophies applied to which test are shown in the data 
sheet. 

1. Ensure the part is adequately decoupled at the test head.
Large changes in Vcc current as the device switches may
cause erroneous function failures due to Vcc changes.

2. Do not leave inputs floating during any tests, as they may 
start to oscillate at high frequency. 

3. Do not attempt to perform threshold tests at high speed. 
Following an output transition, ground current may change 
by as much as 400 mA in 5-8 ns. Inductance in the ground 
cable may allow the ground pin at the device to rise by 
hundreds of millivolts momentarily. Current level may vary 
from product to product. 

4. Use extreme care in defining input levels for AC tests.
Many inputs may be changed at once, so there will be
significant noise at the device pins and they may not
actually reach VIL or VIH until the noise has settled. AMD
recommends using VIL ≤ 0 V and VIH ≥ 3.0 V for AC tests.

5. To simplify failure analysis, programs should be designed 
to perform DC, Function, and AC tests as three distinct 
groups of tests. 

6. Capacitive Loading for AC Testing 

Automatic testers and their associated hardware have stray
capacitance that varies from one type of tester to another
but is generally around 50 pF. This, of course, makes it
impossible to make direct measurements of parameters
that call for a smaller capacitive load than the associated
stray capacitance. Typical examples of this are the so-called
"float delays," which measure the propagation
delays into the high-impedance state and are usually
specified at a load capacitance of 5.0 pF. In these cases,
the test is performed at the higher load capacitance
(typically 50 pF) and engineering correlations based on
data taken with a bench setup are used to predict the
result at the lower capacitance.

Similarly, a product may be specified at more than one
capacitive load. Since the typical automatic tester is not
capable of switching loads in mid-test, it is impossible to
make measurements at both capacitances even though
they may both be greater than the stray capacitance. In
these cases, a measurement is made at one of the two
capacitances. The result at the other capacitance is
predicted from engineering correlations based on data
taken with a bench setup and the knowledge that certain
DC measurements (IOH, IOL, for example) have already
been taken and are within spec. In some cases, special DC
tests are performed in order to facilitate this correlation.

7. Threshold Testing 

The noise associated with automatic testing (due to the 
long, inductive cables), and the high gain of the tested 
device when in the vicinity of the actual device threshold, 
frequently give rise to oscillations when testing high-speed 
circuits. These oscillations are not indicative of a reject 
device, but instead, of an overtaxed test system. To 
minimize this problem, thresholds are tested at least once 
for each input pin. Thereafter, "hard" high and low levels 
are used for other tests. Generally this means that function 
and AC testing are performed at "hard" input levels rather 
than at VIL Max. and VIH Min.

8. AC Testing 

Occasionally parameters are specified that cannot be
measured directly on automatic testers because of tester
limitations. Data input hold times often fall into this category.
In these cases, the parameter in question is guaranteed
by correlating these tests with other AC tests that have
been performed. These correlations are arrived at by the
cognizant engineer using data from precise bench measurements
in conjunction with the knowledge that certain DC
parameters have already been measured and are within
spec.

In some cases, certain AC tests are redundant since they 
can be shown to be predicted by other tests that have 
already been performed. In these cases, the redundant 
tests are not performed. 

9. Output Short-Circuit Current Testing

When performing IOS tests on devices containing RAM or
registers, great care must be taken that undershoot caused
by grounding the high-state output does not trigger parasitic
elements which in turn cause the device to change state.
In order to avoid this effect, it is common to make the
measurement at a voltage (Voutput) that is slightly above
ground. The Vcc is raised by the same amount so that the
result (as confirmed by Ohm's law and precise bench
testing) is identical to the Vout = 0, Vcc = Max. case.
SWITCHING WAVEFORMS
KEY TO SWITCHING WAVEFORMS

(Waveform symbols not reproduced. Recoverable key entries:)
Outputs: will be changing from H to L.
Outputs: will be changing from L to H.
Inputs: don't care, any change permitted. Outputs: changing, state unknown.
Inputs: does not apply. Outputs: center line is high-impedance "off" state.

Queue Cycle
(Timing diagram not reproduced. Signals shown: QCLK, D0-31, PD0-3, QEN, BSW, FULL/A-FULL, EMPTY/A-EMPTY, CNT0-6, PDERR.)

Dequeue Cycle
(Timing diagram not reproduced. Signals shown: DQCLK, DQEN, OE, Y0-31/PY0-3, PYERR, FULL/A-FULL, EMPTY/A-EMPTY, CNT0-6.)

RESET Timing Diagram
(Timing diagram not reproduced. Signals shown: RESET, EMPTY/A-EMPTY, FULL/A-FULL, CNT0-6, Y0-31/PY0-3, PYERR.)

Notes: 1. Minimum time RESET must be asserted.

2. This timing diagram is applicable to RXMIT.

(Waveform figure not reproduced. Recoverable labels: OUTPUTS, CLOCK, INPUT TO OUTPUT DELAY, CLOCK TO OUTPUT DELAY.)

INPUT/OUTPUT CIRCUIT DIAGRAM

(Diagram not reproduced. It shows a driving output, with currents IOH and IOL, and a driven input, with currents IIL and IIH.)
CHAPTER 4



Arithmetic Processors

Am29C323 CMOS 32-Bit Parallel Multiplier

Am29325 32-Bit Floating-Point Processor

Am29C325 CMOS 32-Bit Floating-Point Processor

Am29C327 CMOS Double-Precision Floating-Point Processor



Am29C323 

CMOS 32-Bit Parallel Multiplier 






PRELIMINARY 



DISTINCTIVE CHARACTERISTICS 



32-Bit Three-Bus Architecture

- The device has two 32-bit input ports and one 32-bit
output port with clocked multiply time of 100 ns

Speed Selects

- 80- and 55-ns speed-select parts

Single Clock with Register Enables

- The Am29C323 is controlled by one clock with
individual register enables

Supports Multiprecision Multiplication

- The device has dual 32-bit registers on each data
input port to perform multiprecision multiplication

Registers can be made transparent

- Input and output registers can be made transparent
independently to eliminate unwanted pipeline delay

Supports Two's Complement, Unsigned or Mixed
Numbers

Data Integrity Through Master-Slave Mode and Parity
Check/Generate

- Parity check/generate catches inter-device
connection errors and master/slave mode provides
complete function check


GENERAL DESCRIPTION 



The Am29C323 is a high-speed 32 x 32-Bit CMOS Parallel
Multiplier with 67-Bit Accumulator. The part is designed to
maximize system-level performance by providing a 32-bit
three-bus architecture and a single clock with register
enables.

The Am29C323 further enhances system throughput by
providing individual register feedthrough controls, byte
parity checking on both input ports and generation on the
output port, and dual input registers on each data input bus
to support multiprecision multiplication. The Am29C323 can
manage a wide variety of data types, including two's
complement, unsigned, or mixed-mode input formats. A
64 x 64-bit multiplication can be performed in seven clock
cycles, including input and output. Additional features
provided are a format adjust control allowing for standard
output or left-shifted output suitable for fractional two's
complement arithmetic, rounding, and master/slave operation.

The Am29C323 is designed in low-power, high-speed
CMOS with TTL-compatible I/O. The device is housed in a
169-lead pin-grid-array package.
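
The arithmetic behind multiprecision multiplication can be sketched as follows: a 64 x 64-bit unsigned product is assembled from four 32 x 32-bit partial products that are shifted and accumulated. The sketch shows only the arithmetic decomposition; the function names are hypothetical, and it does not model the device's registers, its 67-bit accumulator, its seven-cycle schedule, or signed and mixed-mode operands.

```python
MASK32 = 0xFFFFFFFF


def mul32(a, b):
    """One 32 x 32-bit unsigned multiply, yielding a 64-bit product."""
    return (a & MASK32) * (b & MASK32)


def mul64_unsigned(x, y):
    """64 x 64-bit unsigned multiply built from four 32 x 32-bit partial products."""
    xl, xh = x & MASK32, (x >> 32) & MASK32
    yl, yh = y & MASK32, (y >> 32) & MASK32

    acc = mul32(xl, yl)                            # weight 2**0
    acc += (mul32(xl, yh) + mul32(xh, yl)) << 32   # cross terms, weight 2**32
    acc += mul32(xh, yh) << 64                     # weight 2**64
    return acc                                     # 128-bit result


product = mul64_unsigned(0x0123456789ABCDEF, 0xFEDCBA9876543210)
```

Python integers are unbounded, so the accumulation here never overflows; in hardware the running sum of the shifted partial products needs guard bits beyond 64, which is the role an accumulator wider than the product plays.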
SIMPLIFIED BLOCK DIAGRAM

(Diagram not reproduced. Recoverable labels: X bus, Y bus, input registers, parity check, shifter, 67-bit adder, product register.)
Publication # 07830   Rev. B   Amendment /0

Issue Date: August 1987



RELATED AMD PRODUCTS

Part No.    Description
Am29C01     CMOS 4-Bit Microprocessor Slice
Am29C10A    CMOS 12-Bit Sequencer
Am29C101    CMOS 16-Bit Microprocessor
Am29112     8-Bit Cascadable Microprogram Sequencer
Am29114     Real-Time Interrupt Controller
Am29C116    CMOS 16-Bit Microcontroller
Am29325     32-Bit Floating Point Processor
Am29C325    CMOS 32-Bit Floating Point Processor
Am29331     16-Bit Microprogram Sequencer
Am29C331    CMOS 16-Bit Microprogram Sequencer
Am29332     32-Bit Extended Function ALU
Am29C332    CMOS 32-Bit Extended Function ALU
Am29334     64x18 Four-Port Dual Access Register File
Am29C334    CMOS 64x18 Four-Port Dual Access Register File
Am29337     16-Bit Bounds Checker
Am29338     32-Bit Byte Queue
Am29C516    CMOS 16x16 Multiplier
Am29C517    CMOS 16x16 Multiplier with Separate I/O

DETAILED BLOCK DIAGRAM

(Diagram not reproduced. Recoverable block labels: parity check, multiplexers, XA register, 33 x 33 multiplier array, 67-bit adder, product register, temporary register. Recoverable signal labels: CLK, ENXA, ENXB, ENYA, ENYB, ENP, ENT, FA, TSEL, PSEL0, PSEL1, SLAVE, XSEL, YSEL, TCX, TCY, ACC0, ACC1, RND, FTX, FTY.)
CONNECTION DIAGRAM
169-Lead PGA
Bottom View

(Pin-grid diagram not reproduced; pin assignments are listed under PIN DESIGNATIONS.)

*Pinout observed from pin side of package.
**Pin 169 for reference only.
PIN DESIGNATIONS

(Sorted by Pin Number)

PAD  PIN  PIN       PAD  PIN  PIN       PAD  PIN  PIN       PAD  PIN  PIN
NO.  NO.  NAME      NO.  NO.  NAME      NO.  NO.  NAME      NO.  NO.  NAME

1    A1   PY3       75   C9   P23       55   J15  P6        117  R10  X14
168  A2   GND       72   C10  P21       51   J16  P5        116  R11  GND
83   A3   NC        74   C11  Vcc       135  J17  GND       36   R12  X9
81   A4   Vcc       153  C12  P17       14   K1   Y10       121  R13  X7
80   A5   P30       151  C13  FTP       13   K2   Y12       40   R14  X2
79   A6   GND       66   C14  NC        96   K3   Y13       125  R15  FTX
160  A7   P25       146  C15  SLAVE     50   K15  P3        128  R16  TCX
77   A8   Vcc       145  C16  PP1       134  K16  P4        45   R17  ACC0
157  A9   NC        61   C17  GND       133  K17  P2        105  T1   ENYA
71   A10  GND       4    D1   Y26       97   L1   Y11       21   T2   ENYB
154  A11  P19       87   D2   Y27       98   L2   Y9        107  T3   X30
69   A12  P16       3    D3   Y28       95   L3   GND       108  T4   X28
68   A13  Vcc       62   D15  P15       53   L15  Vcc       110  T5   X24
67   A14  NC        144  D16  P14       53   L16  Vcc       111  T6   X23
65   A15  ENT       60   D17  NC        53   L17  Vcc       113  T7   X19
148  A16  PSEL1     5    E1   Y24       16   M1   Y7        114  T8   X17
64   A17  PSEL0     89   E2   PY2       99   M2   PY0       31   T9   X16
85   B1   Y31       88   E3   Y25       15   M3   Y8        34   T10  X13
84   B2   NC        142  E15  P11       132  M15  P1        119  T11  X10
166  B3   PRERR     143  E16  P13       47   M16  ENI       120  T12  X8
165  B4   PP3       57   E17  Vcc       48   M17  P0        122  T13  X5
164  B5   P31       6    F1   Y23       17   N1   Y5        123  T14  X3
162  B6   P28       7    F2   Y21       101  N2   Y4        124  T15  X1
161  B7   P26       90   F3   Y22       100  N3   Y6        42   T16  ENXB
76   B8   P24       59   F15  P12       130  N15  FTI       127  T17  YSEL
73   B9   NC        141  F16  P9        131  N16  CLK       22   U1   PX3
156  B10  P22       58   F17  P10       49   N17  Vcc       106  U2   X31
155  B11  P20       91   G1   Y20       18   P1   Y3        23   U3   GND
70   B12  P18       92   G2   Y18       102  P2   Y2        25   U4   X27
152  B13  HDERR     11   G3   Vcc       19   P3   Y1        26   U5   X25
150  B14  ENP       137  G15  GND       44   P15  TCY       28   U6   X22
149  B15            137  G16  GND       129  P16  ACC1      112  U7   X21
63   B16  FA        137  G17  GND       46   P17  RND       29   U8   X20
147  B17  TSEL      8    H1   Y19       20   R1   GND       115  U9   PX1
2    C1   Y30       93   H2   Y16       103  R2   Y0        35   U10  X11
86   C2   Y29       9    H3   Y17       104  R3   FTY       118  U11  X12
167  C3   NC        139  H15  P7        24   R4   X29       37   U12  PX0
82   C4   NC        56   H16  PP0       109  R5   X26       38   U13  X6
163  C5   P29       140  H17  P8        27   R6   PX2       39   U14  X4
78   C6   P27       94   J1   Y15       32   R7   Vcc       41   U15  X0
158  C7   GND       10   J2   PY1       30   R8   X18       126  U16  ENXA
159  C8   PP2       12   J3   Y14       33   R9   X15       43   U17  XSEL
PIN DESIGNATIONS

(Sorted by Pin Name)

PAD  PIN  PIN       PAD  PIN  PIN       PAD  PIN  PIN       PAD  PIN  PIN
NO.  NO.  NAME      NO.  NO.  NAME      NO.  NO.  NAME      NO.  NO.  NAME

45   R17  ACC0      50   K15  P3        89   E2   PY2       110  T5   X24
129  P16  ACC1      134  K16  P4        1    A1   PY3       26   U5   X25
131  N16  CLK       51   J16  P5        46   P17  RND       109  R5   X26
47   M16  ENI       55   J15  P6        146  C15  SLAVE     25   U4   X27
150  B14  ENP       139  H15  P7        128  R16  TCX       108  T4   X28
65   A15  ENT       140  H17  P8        44   P15  TCY       24   R4   X29
126  U16  ENXA      141  F16  P9        147  B17  TSEL      107  T3   X30
42   T16  ENXB      58   F17  P10       68   A13  Vcc       106  U2   X31
105  T1   ENYA      142  E15  P11       81   A4   Vcc       43   U17  XSEL
21   T2   ENYB      59   F15  P12       77   A8   Vcc       103  R2   Y0
63   B16  FA        143  E16  P13       74   C11  Vcc       19   P3   Y1
130  N15  FTI       144  D16  P14       57   E17  Vcc       102  P2   Y2
151  C13  FTP       62   D15  P15       11   G3   Vcc       18   P1   Y3
125  R15  FTX       69   A12  P16       53   L15  Vcc       101  N2   Y4
104  R3   FTY       153  C12  P17       53   L16  Vcc       17   N1   Y5
71   A10  GND       70   B12  P18       53   L17  Vcc       100  N3   Y6
168  A2   GND       154  A11  P19       49   N17  Vcc       16   M1   Y7
79   A6   GND       155  B11  P20       32   R7   Vcc       15   M3   Y8
61   C17  GND       72   C10  P21       41   U15  X0        98   L2   Y9
158  C7   GND       156  B10  P22       124  T15  X1        14   K1   Y10
137  G15  GND       75   C9   P23       40   R14  X2        97   L1   Y11
137  G16  GND       76   B8   P24       123  T14  X3        13   K2   Y12
137  G17  GND       160  A7   P25       39   U14  X4        96   K3   Y13
135  J17  GND       161  B7   P26       122  T13  X5        12   J3   Y14
95   L3   GND       78   C6   P27       38   U13  X6        94   J1   Y15
20   R1   GND       162  B6   P28       121  R13  X7        93   H2   Y16
116  R11  GND       163  C5   P29       120  T12  X8        9    H3   Y17
23   U3   GND       80   A5   P30       36   R12  X9        92   G2   Y18
152  B13  HDERR     164  B5   P31       119  T11  X10       8    H1   Y19
157  A9   NC        166  B3   PRERR     35   U10  X11       91   G1   Y20
60   D17  NC        56   H16  PP0       118  U11  X12       7    F2   Y21

73 


89 


NC 


145 


C16 


PPl 


34 


T10 


Xl3 


90 


F3 


Y22 


82 


C4 


NC 


159 


C8 


PP2 


117 


RIO 


Xl4 


6 


F1 


Y23 


83 


A3 


NC 


165 


84 


PPa 


33 


R9 


Xl5 


5 


El 


Y24 


84 


B2 


NC 


64 


A17 


PSELO 


31 


T9 


X16 


88 


E3 


Y25 


66 


C14 


NC 


148 


A16 


PSEL1 


114 


T8 


Xi7 


4 


D1 


Y26 


167 


C3 


NC 


37 


U12 


PXo 


30 


R8 


X18 


87 


D2 


Y27 


67 


A14 


NC 


115 


U9 


PXi 


113 


T7 


X19 


3 


D3 


Y28 


149 


B15 


OE 


27 


R6 


PX2 


29 


U8 


X20 


86 


C2 


Y29 


48 


Ml 7 


Po 


22 


U1 


PX3 


112 


U7 


X21 


2 


CI 


Y30 


132 


M15 


P1 


99 


M2 


PYo 


28 


U6 


X22 


85 


B1 


Y31 


133 


K17 


P2 


10 : J2 


PYi 


111 


T6 


X23 


127 


T17 


YSEL 


1 

LOGIC SYMBOL

[Figure: logic symbol. Inputs: X0-X31, PX0-PX3, Y0-Y31, PY0-PY3, CLK, ENXA, ENXB, ENYA, ENYB, ENI, ENP, ENT, FA, TSEL, PSEL0, PSEL1, OE, SLAVE, XSEL, YSEL, TCX, TCY, ACC0, ACC1, RND, FTX, FTY, FTI, FTP. Outputs: P0-P31, PP0-PP3, PRERR, HDERR.]

ORDERING INFORMATION
Standard Products

AMD standard products are available in several packages and operating ranges. The order number (Valid Combination) is
formed by a combination of:

a. Device Number
b. Speed Option (if applicable)
c. Package Type
d. Temperature Range
e. Optional Processing

a. DEVICE NUMBER/DESCRIPTION
   Am29C323
   CMOS 32-Bit Parallel Multiplier

b. SPEED OPTION
   -1 = 80 ns
   -2 = 55 ns

c. PACKAGE TYPE
   G = 169-Lead Pin Grid Array without Heatsink (CGX169)

d. TEMPERATURE RANGE
   C = Commercial (0 to +70°C)

e. OPTIONAL PROCESSING
   Blank = Standard processing
   B = Burn-in

Valid Combinations

   AM29C323      GC, GCB
   AM29C323-1    GC, GCB
   AM29C323-2    GC, GCB

Valid Combinations list configurations planned to be
supported in volume for this device. Consult the local AMD
sales office to confirm availability of specific valid
combinations, to check on newly released combinations, and
to obtain additional data on AMD's standard military grade
products.

ORDERING INFORMATION
APL Products

AMD products for Aerospace and Defense applications are available in several packages and operating ranges. APL (Approved
Products List) products are fully compliant with MIL-STD-883C requirements. The order number (Valid Combination) for APL
products is formed by a combination of:

a. Device Number
b. Speed Option (if applicable)
c. Device Class
d. Package Type
e. Lead Finish

a. DEVICE NUMBER/DESCRIPTION
   Am29C323
   CMOS 32-Bit Parallel Multiplier

b. SPEED OPTION
   Not Applicable

c. DEVICE CLASS
   /B = Class B

d. PACKAGE TYPE
   Z = 169-Lead Pin Grid Array without Heatsink (CGX169)

e. LEAD FINISH
   C = Gold

Valid Combinations

   AM29C323      /BZC

Valid Combinations list configurations planned to be
supported in volume for this device. Consult the local AMD
sales office to confirm availability of specific valid
combinations or to check for newly released valid
combinations.

Group A Tests

Group A tests consist of Subgroups
1, 2, 3, 7, 8, 9, 10, 11.

PIN DESCRIPTION

ACC0, ACC1 Accumulator Control (Input)

Accumulator control lines used to select the accumulator
function: PASS, ACCUMULATE, or SHIFT AND ACCUMULATE.

CLK Clock (Input)

Clock input for all registers.

ENI Instruction Register Enable (Input; Active LOW)

Register enable for instruction register I.

ENP Accumulator Register Enable (Input; Active LOW)

Register enable for product register P.

ENT Temporary Register Enable (Input; Active LOW)

Register enable for temporary register T.

ENXA, ENXB Multiplicand Register Enable (Input; Active LOW)

Register enables for multiplicand data input registers XA
and XB.

ENYA, ENYB Multiplier Register Enable (Input; Active LOW)

Register enables for multiplier data input registers YA and
YB.

FA Format Adjust (Input)

Format adjust selects either a full 64-bit product (HIGH) or a
left-shifted 63-bit product suitable for fractional two's
complement arithmetic (LOW).

FTP Feedthrough Control (Input; Active HIGH)

Feedthrough control for product register.

FTX, FTY, FTI Feedthrough Control (Input; Active HIGH)

Feedthrough control lines for X, Y, and I registers.

HDERR Hard Error Flag (Output)

Used when two Am29C323s are configured as master and
slave to indicate hardware errors.

OE Output Enable Control (Input; Active LOW)

Used to enable (LOW) or disable (HIGH) the P output port.

P0-P31 Product Output (Input/Output; Three State)

Product output for P port.

PRERR Parity Error Flag (Input/Output; Three State)

Indicates a parity error on the input buses.

PP0-PP3 Byte Parity (Input/Output; Three State)

Byte parity generated on P output port (even parity).

PSEL0, PSEL1 Product Control (Input)

Used to select desired output including disabling P and PP
output ports.

PX0-PX3 Byte Parity (Input)

Byte parity inputs on X input port (even parity).

PY0-PY3 Byte Parity (Input)

Byte parity inputs on Y input port (even parity).

RND Round Control (Input; Active HIGH)

Round control for rounding the most significant product.

SLAVE Master/Slave Control (Input)

Used to determine mode of operation.

TCX, TCY Mode Control (Input)

Mode control inputs for each input data word; LOW for
unsigned data and HIGH for two's complement format.

TSEL Select Control (Input)

Used to route the most significant product register (HIGH) or
the least significant product register (LOW) into the
temporary register.

X0-X31 Multiplicand Data (Input)

Multiplicand data input for X port.

XSEL X Register Select (Input)

Control line used to route the contents of either the XA
register (HIGH) or XB register (LOW) into the multiplier
array.

Y0-Y31 Multiplier Data (Input)

Multiplier data input for Y port.

YSEL Y Register Select (Input)

Control line used to route the contents of either the YA
register (HIGH) or YB register (LOW) into the multiplier
array.



FUNCTIONAL DESCRIPTION 

Architecture 

The Am29C323 comprises a high speed 32 by 32-bit multiplier 
array, a 67-bit accumulator, and a 32-bit data path. 

Multiplier Array 

The multiplier is a 32 by 32-bit array that produces a 64-bit 
product. This product is then fed to the accumulator section. 

Accumulator 

The accumulator is 67 bits wide. It performs accumulation for 
sum of product operations and multiprecision multiplication 
operations. The accumulator can perform three operations: 
store product without accumulation, accumulate product, and 
shift accumulator value and accumulate with product. 

The shift and accumulate shifts the value in the product 
register 32 bits to the right (effectively moving the most 
significant 32 bits to the least significant 32 bits) and sign 
extends to a full 64 bits. This shifted value is then accumulated 
with the output of the multiplier array. 



The 67-bit width is necessary to contain overflows in internal
accumulations. These overflows are maintained and used 
when the product register is right shifted in the multiprecision 
multiplies. The lower 64 bits contain the 64-bit output while the 
upper 3 bits contain the overflow. 
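
The three accumulator operations described above can be sketched in a few lines of Python. This is only an illustrative model, not the device's logic: the function names are hypothetical, the accumulator is treated as one 67-bit two's complement value, and the 32-bit right shift is modeled as an arithmetic shift.

```python
ACC_BITS = 67  # accumulator width stated in the text


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def accumulate(acc, product, op):
    """Return the next accumulator value for one of the three operations.

    acc     -- current accumulator contents (signed int)
    product -- output of the multiplier array (signed int)
    op      -- "PASS", "ACC", or "S/A" (shift and accumulate)
    """
    if op == "PASS":
        result = product                    # store product, no accumulation
    elif op == "ACC":
        result = acc + product              # accumulate product
    elif op == "S/A":
        result = (acc >> 32) + product      # shift right 32 bits, then add
    else:
        raise ValueError(op)
    return to_signed(result, ACC_BITS)      # wrap to the 67-bit width
```

For example, `accumulate(3 << 32, 1, "S/A")` moves the upper word down to the lower word and adds the new product, giving 4.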

Data Path 

The 32-bit data path consists of X and Y input buses; the P 
output bus; data registers XA, XB, YA, YB, and the product 
accumulator; two multiplier input multiplexers; byte parity input 
checkers; byte parity output generators; and master/slave 
comparators. Input operands enter the device through the two 
32-bit input buses, X0-X31 and Y0-Y31. These operands 
may then be stored in one of the two registers for each bus 
(XA or XB for X, YA or YB for Y) or they may be fed directly 
through to the multiplier array. Input parity checking is performed
as soon as the operands are put on the input buses.
The signals used for output parity generation are taken from 
the input side of the output translator. In case of parity error, 
PRERR is enabled HIGH. 
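
As a rough illustration of byte-wise even parity checking, the sketch below computes four parity bits for a 32-bit word and compares them with the supplied parity inputs. The helper names are hypothetical, and the mapping of parity bit i to byte i (bits 8i through 8i+7) is an assumption that the text does not state.

```python
def byte_parity_bits(word):
    """Return four even-parity bits for a 32-bit word, bit i covering byte i."""
    bits = 0
    for i in range(4):
        byte = (word >> (8 * i)) & 0xFF
        ones = bin(byte).count("1")
        bits |= (ones & 1) << i   # parity bit makes the total number of ones even
    return bits


def parity_error(word, parity):
    """True when the supplied parity bits disagree with the data word."""
    return byte_parity_bits(word) != (parity & 0xF)
```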
Operational Modes

The Am29C323 can perform signed, unsigned, or mixed mode 
multiplication. These different numerical representations are 
controlled by TCX and TCY. A HIGH input on one of these 
lines indicates to the device that the respective input should 
be treated as a two's complement number; a LOW, an 
unsigned number. The output format is unsigned when both 
inputs are unsigned. The output format is two's complement 
when either or both inputs are two's complement. 
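
The effect of TCX and TCY can be sketched as follows. This is a behavioral illustration with hypothetical function names, not a description of the array's implementation: each 32-bit word is interpreted as unsigned or two's complement before an ordinary integer multiply.

```python
def operand_value(word, tc):
    """Interpret a 32-bit word as two's complement (tc = 1) or unsigned (tc = 0)."""
    word &= 0xFFFFFFFF
    if tc and word & 0x80000000:
        return word - (1 << 32)
    return word


def multiply(x_word, y_word, tcx, tcy):
    """Return (product, signed_output) for signed, unsigned, or mixed mode."""
    product = operand_value(x_word, tcx) * operand_value(y_word, tcy)
    # The output is two's complement when either input is two's complement.
    return product, bool(tcx or tcy)
```

With X = 0xFFFFFFFF and Y = 2, mixed mode (TCX = 1, TCY = 0) yields -2, while unsigned mode yields 0x1FFFFFFFE.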

Slave Mode 

Each output has an associated comparator which compares 
the signal on the output pin with the signal provided to the 
output driver. If any of these outputs does not agree, HDERR
is asserted. When not in slave mode, this enables the 
multiplier to check for contention and bus shorts. However, 
when in slave mode, one multiplier can be used to detect 
faults in both internal functions and interconnections of the 
other multiplier. This is accomplished through the master/ 
slave configuration, where the two multipliers operate in 
parallel. One multiplier is the master and operates normally; 
the other operates in slave mode. 

In slave mode all outputs are turned into inputs from the 
master, except for the HDERR signal. Since the slave is 
operated in parallel with the master, it can compare the results 
it generates to those of the master and signal an error if they 
differ. 
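
The comparison performed by the output comparators amounts to a bitwise mismatch test. The sketch below is a minimal model with a hypothetical function name; the default width of 36 bits (P0-P31 plus PP0-PP3) is an assumption chosen for illustration.

```python
def hderr(driven, observed, width=36):
    """True when any compared pin differs from the value the device generated.

    driven   -- bit vector this device computed for its outputs
    observed -- bit vector actually present on the pins (from the master,
                when this device is the slave)
    """
    mask = (1 << width) - 1
    return ((driven ^ observed) & mask) != 0
```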

Command Description and Formats 

The accumulator is controlled by ACC0 and ACC1. These
lines are used to select any of the three operations that the
accumulator can perform. This instruction set is described in
Table 1.

The temporary output register is controlled by TSEL and FA. 
These lines are used to select any of the four different sets of 
data that can be stored in the temporary register. This 
instruction set is described in Table 2. 

The output multiplexer is controlled by PSEL0, PSEL1, and
FA. These lines are used to select any of the five different sets
of data that can be output through the P port. PSEL0 and
PSEL1 can also be used to disable the outputs. (This
instruction is independent of OE.) This instruction set is
described in Table 3.

Format Adjust (FA) is used to select either a full 64-bit product
or a left-shifted 63-bit product suitable for fractional two's
complement arithmetic. This shifting increases the precision of
the upper half of the product word by eliminating the redundant
sign bit. The P-port output data formats show the effect of FA.
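
A small numeric sketch may help show what the shift does. It assumes the product is a 64-bit two's complement value and that FA = LOW shifts it left by one bit before it is split into words; the function name is hypothetical.

```python
MASK64 = (1 << 64) - 1


def p_words(product, fa):
    """Split a 64-bit product into (most significant, least significant) words.

    fa = 1 -- full 64-bit product
    fa = 0 -- product shifted left by one bit (fractional two's complement)
    """
    p = product & MASK64
    if not fa:
        p = (p << 1) & MASK64
    return p >> 32, p & 0xFFFFFFFF
```

Multiplying 0.5 by 0.5 in fractional two's complement (0x40000000 times 0x40000000) gives an upper word of 0x20000000 with FA = LOW, which reads as 0.25 in the same format as the inputs. With FA = HIGH the upper word is 0x10000000 and carries two sign bits.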

Round (RND) is used to round the upper 32 bits of the 64-bit
product. If only the upper 32 bits of the product are being
used, then the lower 32 bits are truncated when rounding is
not used (RND = 0). If rounding is used (RND = 1), then a "1"
is added to the most significant bit of the lower 32 bits. This
results in a smaller possible error. This should only be used
when the lower 32 bits are to be truncated.
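
Rounding as described can be sketched like this. The function name is hypothetical, and any interaction with the FA shift is not modeled.

```python
MASK64 = (1 << 64) - 1


def upper_word(product, rnd):
    """Return the upper 32 bits of a 64-bit product, optionally rounded."""
    p = product & MASK64
    if rnd:
        # Add a 1 at the most significant bit of the lower 32 bits.
        p = (p + (1 << 31)) & MASK64
    return p >> 32
```

A product of 0x0000000180000000 truncates to 1 but rounds to 2, which halves the worst-case error of the 32-bit result.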

User Visible Register Descriptions 

The Am29C323 contains seven different register sets, each
with its own clock enable. Two 32-bit registers are attached to
each of the input data buses. These registers are differentiated
by the suffix A or B. For example, the X bus has registers
XA and XB. The 67-bit accumulator register can be used as a
regular product register when the part is used as a multiplier
only or as the register part of the accumulator section. The
32-bit temporary output register is included to aid in the pipelining
of multiprecision operations. An instruction register is also
provided.

All of these registers can be made transparent with the 
exception of the accumulator register and the temporary 
register. The product from the multiplier can be fed directly to 
the output by using the FTP control line. 

TABLE 1. ACCUMULATOR OPERATION INSTRUCTIONS

ACC1   ACC0   Accumulator Operation
 0      0     PASS
 0      1     ACCUMULATE
 1      0     INVALID
 1      1     SHIFT AND ACCUMULATE

TABLE 2. INPUT SELECT INSTRUCTIONS FOR TEMPORARY (T) REGISTER

TSEL   FA   Temp Reg Input
 0     0    P(i-1)
 0     1    P(i)
 1     0    P(i+31)
 1     1    P(i+32)

TABLE 3. OUTPUT SELECT INSTRUCTIONS FOR PRODUCT (P) PORT

PSEL1   PSEL0   FA   P Port Output
 0       0      X    TEMP REGISTER
 0       1      0    P(i-1)
 0       1      1    P(i)
 1       0      0    P(i+31)
 1       0      1    P(i+32)
 1       1      X    DISABLE
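
The control encodings of Tables 1 through 3 can be captured as simple lookups. The sketch below is only a convenience for reading the tables; the names are hypothetical and the strings mirror the table entries, with `P[i-1]` standing for product bit i-1 routed to output bit i.

```python
# Table 1: (ACC1, ACC0) -> accumulator operation
ACC_OPS = {
    (0, 0): "PASS",
    (0, 1): "ACCUMULATE",
    (1, 0): "INVALID",
    (1, 1): "SHIFT AND ACCUMULATE",
}

# Table 2: (TSEL, FA) -> temporary register input
T_INPUT = {
    (0, 0): "P[i-1]",
    (0, 1): "P[i]",
    (1, 0): "P[i+31]",
    (1, 1): "P[i+32]",
}


def p_port_output(psel1, psel0, fa):
    """Table 3: select the data driven on the P port."""
    if (psel1, psel0) == (0, 0):
        return "TEMP REGISTER"      # FA is a don't-care
    if (psel1, psel0) == (1, 1):
        return "DISABLE"            # FA is a don't-care
    # For the remaining codes PSEL1 picks the product half, as TSEL does.
    return T_INPUT[(psel1, fa)]
```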






Am29C323 X AND Y INPUT DATA FORMATS

Fractional Two's Complement (TCX, TCY = 1)

Bit:      31     30     29     28     27     26    ...    3      2      1      0
Weight:  -2^0   2^-1   2^-2   2^-3   2^-4   2^-5   ...   2^-28  2^-29  2^-30  2^-31

Integer Two's Complement (TCX, TCY = 1)

Bit:      31     30     29     28     27     26    ...    3      2      1      0
Weight:  -2^31  2^30   2^29   2^28   2^27   2^26   ...   2^3    2^2    2^1    2^0

Unsigned Fractional (TCX, TCY = 0)

Bit:      31     30     29     28     27     26    ...    3      2      1      0
Weight:   2^-1   2^-2   2^-3   2^-4   2^-5   2^-6  ...   2^-29  2^-30  2^-31  2^-32

Unsigned Integer (TCX, TCY = 0)

Bit:      31     30     29     28     27     26    ...    3      2      1      0
Weight:   2^31   2^30   2^29   2^28   2^27   2^26  ...   2^3    2^2    2^1    2^0

Am29C323 P-PORT OUTPUT DATA FORMATS

Fractional Two's Complement (Shifted)*

FA = 0, PSEL1 = 1, PSEL0 = 0
Bit:      31     30     29     28     27     26    ...    3      2      1      0
Weight:  -2^0   2^-1   2^-2   2^-3   2^-4   2^-5   ...   2^-28  2^-29  2^-30  2^-31

FA = 0, PSEL1 = 0, PSEL0 = 1
Bit:      31     30     29     28     27     26    ...    3      2      1      0
Weight:   2^-32  2^-33  2^-34  2^-35  2^-36  2^-37 ...   2^-60  2^-61  2^-62  2^-63**

Fractional Two's Complement

FA = 1, PSEL1 = 1, PSEL0 = 0
Bit:      31     30     29     28     27     26    ...    3      2      1      0
Weight:  -2^1   2^0    2^-1   2^-2   2^-3   2^-4   ...   2^-27  2^-28  2^-29  2^-30

FA = 1, PSEL1 = 0, PSEL0 = 1
Bit:      31     30     29     28     27     26    ...    3      2      1      0
Weight:   2^-31  2^-32  2^-33  2^-34  2^-35  2^-36 ...   2^-59  2^-60  2^-61  2^-62

Integer Two's Complement

FA = 1, PSEL1 = 1, PSEL0 = 0
Bit:      31     30     29     28     27     26    ...    3      2      1      0
Weight:  -2^63  2^62   2^61   2^60   2^59   2^58   ...   2^35   2^34   2^33   2^32

FA = 1, PSEL1 = 0, PSEL0 = 1
Bit:      31     30     29     28     27     26    ...    3      2      1      0
Weight:   2^31   2^30   2^29   2^28   2^27   2^26  ...   2^3    2^2    2^1    2^0

Unsigned Fractional

FA = 1, PSEL1 = 1, PSEL0 = 0
Bit:      31     30     29     28     27     26    ...    3      2      1      0
Weight:   2^-1   2^-2   2^-3   2^-4   2^-5   2^-6  ...   2^-29  2^-30  2^-31  2^-32

FA = 1, PSEL1 = 0, PSEL0 = 1
Bit:      31     30     29     28     27     26    ...    3      2      1      0
Weight:   2^-33  2^-34  2^-35  2^-36  2^-37  2^-38 ...   2^-61  2^-62  2^-63  2^-64

Unsigned Integer

FA = 1, PSEL1 = 1, PSEL0 = 0
Bit:      31     30     29     28     27     26    ...    3      2      1      0
Weight:   2^63   2^62   2^61   2^60   2^59   2^58  ...   2^35   2^34   2^33   2^32

FA = 1, PSEL1 = 0, PSEL0 = 1
Bit:      31     30     29     28     27     26    ...    3      2      1      0
Weight:   2^31   2^30   2^29   2^28   2^27   2^26  ...   2^3    2^2    2^1    2^0

*In this format, an overflow occurs in the attempted multiplication of the two's complement number -1.000 with itself, yielding a
product of +1.000 which cannot be represented in this format.
**This bit position (2^-63) equals zero in this format.
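
The bit weights in the X and Y input data formats can be checked numerically. The sketch below, with a hypothetical function name, returns the exact value of a 32-bit input word under each of the four input formats, using Python's `fractions.Fraction` to avoid rounding.

```python
from fractions import Fraction


def word_value(word, tc, fractional):
    """Value of a 32-bit input word.

    tc         -- 1 for two's complement, 0 for unsigned (TCX or TCY)
    fractional -- True for the fractional formats, False for integer
    """
    word &= 0xFFFFFFFF
    value = word - (1 << 32) if tc and word >> 31 else word
    if not fractional:
        return Fraction(value)
    # Fractional two's complement: bit 31 weighs -2^0, so scale by 2^-31.
    # Unsigned fractional: bit 31 weighs 2^-1, so scale by 2^-32.
    return Fraction(value, 1 << 31) if tc else Fraction(value, 1 << 32)
```

The word 0x80000000 reads as -1 in fractional two's complement and as 1/2 in unsigned fractional, which matches the weights given for bit 31.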

64 x 64 Multiplication

To perform a 64 x 64-bit multiplication using the Am29C323,
each 64-bit input must be split into two 32-bit inputs: a most
significant half and a least significant half (XW1 and XW0 or
YW1 and YW0, respectively). These 32-bit inputs are then
used to perform the four multiplications needed to obtain the
128-bit product. This product is represented in four 32-bit
words, PW3 through PW0, with PW3 the most significant. The
product is output 32 bits at a time through the product (P) port.
The following equation shows the required multiplications:

X * Y = ((XW1 * YW1) * 2^64) + ((XW0 * YW1) * 2^32)
      + ((XW1 * YW0) * 2^32) + ((XW0 * YW0) * 2^0)

P = (PW3 * 2^96) + (PW2 * 2^64) + (PW1 * 2^32)
  + (PW0 * 2^0)

The Am29C323 uses an internal accumulator to sum these
intermediate products. The previous equation, in a slightly
different form, is shown with the necessary instructions below:

    X =              XW1   XW0
    Y =          x   YW1   YW0
                 -------------
                   XW0 * YW0      Multiply only
             XW1 * YW0            Mult & Shift/Acc
             XW0 * YW1            Mult & Accumulate
       XW1 * YW1                  Mult & Shift/Acc
       -----------------------
    P =  PW3   PW2   PW1   PW0

Table 4 details the movement of the input operands through
the Am29C323. Table 5 defines the microcode required to
perform a signed 64 x 64-bit multiplication. For an unsigned
multiplication, TCX and TCY are LOW for all cycles. The
operations and data movement are scheduled to produce a
single product in seven clock cycles or a new pipelined
product every four clock cycles.
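
The four-step sequence (multiply only, shift/accumulate, accumulate, shift/accumulate) can be checked with a short software model. This is a sketch under stated assumptions, not the device's microcode: the function names are hypothetical, Python integers stand in for the 67-bit accumulator, the low words are treated as unsigned and the high words as two's complement, and the cycle-by-cycle use of the temporary register and P port is not modeled.

```python
MASK32 = 0xFFFFFFFF


def signed32(word):
    """Interpret a 32-bit word as two's complement."""
    word &= MASK32
    return word - (1 << 32) if word & 0x80000000 else word


def mul64x64_signed(x, y):
    """Return [PW0, PW1, PW2, PW3] for the signed 64 x 64-bit product x * y."""
    xw0, xw1 = x & MASK32, signed32(x >> 32)   # low word unsigned, high word signed
    yw0, yw1 = y & MASK32, signed32(y >> 32)

    acc = xw0 * yw0                    # multiply only (PASS)
    pw0 = acc & MASK32
    acc = (acc >> 32) + xw1 * yw0      # multiply and shift/accumulate
    acc = acc + xw0 * yw1              # multiply and accumulate
    pw1 = acc & MASK32
    acc = (acc >> 32) + xw1 * yw1      # multiply and shift/accumulate
    pw2 = acc & MASK32
    pw3 = (acc >> 32) & MASK32
    return [pw0, pw1, pw2, pw3]


def words_to_int(words):
    """Reassemble four 32-bit words into a signed 128-bit integer."""
    value = sum(w << (32 * i) for i, w in enumerate(words))
    return value - (1 << 128) if value >> 127 else value
```

For instance, `words_to_int(mul64x64_signed(-3, 5))` reassembles to -15, with PW0 equal to 0xFFFFFFF1 and the three upper words all ones.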


TABLE 4. BUS AND REGISTER CONTENTS FOR A 64 x 64-BIT SIGNED MULTIPLICATION WITH ONE 
COMPLETE EXTENDED MULTIPLICATION SHOWN IN THE UNSHADED CYCLES 


Cycle 





1 


2 


3 


4 


5 


6 


X BUS 


XWO 


XW1 






XWO 


XlVf 




XA REG 


XWO ,' 


XWO 


XWO 


XWO XWO 


XWO 


, XWO . 


XB REG 


XW1 


• xwt . 


XWl 


XWl XWl 


XW1 


XWl 


Y BUS 


YWO 


YW1 






vwo- 


ywt ■ 




YA REG 


YWO . 


YWO 


YWO 


YWO 


YWO 


YWO 


.YWO 


YB REG 


YW1 . 


YWl 


YW1 


YWl 


^Wl 


YWl 


vwt 


MPY OP 


XWt'YWI 


XWO-YWO 


XWl -YWO 


XWO'YWt 


XWl -YWl 


XWO-YWB 


XW1.YW0 


ACC OP 


S/A ' PASS 


S/A 


ACC 


S/A 


PASS 


S/A 


T REG 




PW3 


PWO 




PW3 




P BUS 


PW1 


PW3 


PW3 


PWO 1 PW1 


PW2 


PW3 


Note: MPY OP = Operation of multiplier array (X * Y)
ACC OP = Operation of internal accumulator
PASS = Pass through multiplier product
ACC = Add previous result to current product
S/A = Shift previous result then add to current product

TABLE 5. INSTRUCTION MICROCODE FOR 64 x 64-BIT SIGNED MULTIPLICATION WITH ONE 
COMPLETE EXTENDED MULTIPLICATION SHOWN IN THE UNSHADED CYCLES 


Cycle 





1 


2 


3 


4 


5 


6 


7 


8 


9 


A 


B 


C 


D 


ENXA 





1 


1 


1 





1 


1 


1 





1 


t 


1 


. 


1 


ENXB 


SMssS 





1 


1 


1 





1 


1 


1 





t 


1- 


1 





TCX 





1 





1 


" 


t 





1 





1 





1 





1 


XSEL 


1 





1 





. 1 - , 





■ 1 





' 





1 





. 1 


'1 


ENYA 





1 


1 


1 





1 


1 


1 





1 


1 


1 





1 


ENYB 


;*?'»?:••' 





1 


1 


1 





■V 


1 


1 





% 


1 


1 


c 


TCY 








1 


1 


D '. j 


1 


1 








T 


1 








YSEL 


1 


1 








1 


1 








1 


1 








' 


1 


ENl 

















■ 





" 

















;■ 


eF5T 


- ,1 


' 





1 


1 








1 


1 








1 


1 


c- 


TSEL 


X 


'1 





X 


X 


1 





^ 


X 


1 





X 


X 




AGCO 





1 


1 


1 





1 


1 


1 





1 


1 I 1 





' 


ACC1 





1 





1 





1 





1 





' 


1 t 





1 


ENP 


- p. 




















1 











i 


PSELO 


1 


1 








1 


1 


' 1 


1 








1 1 


PSEL1 











1 1 1 


( 











. 

ABSOLUTE MAXIMUM RATINGS

Storage Temperature ............................ -65 to +150°C
Ambient Temperature Under Bias ................. -55 to +125°C
Supply Voltage to Ground Potential
  Continuous ................................... -0.3 to +7.0 V
DC Voltage Applied to Outputs For
  High Output State ............................ -0.3 to +VCC + 0.3 V
DC Input Voltage ............................... -0.3 to +VCC + 0.3 V
DC Output Current, Into LOW Outputs ............ 30 mA
DC Input Current ............................... -10 to +10 mA

Stresses above those listed under ABSOLUTE MAXIMUM
RATINGS may cause permanent device failure. Functionality
at or above these limits is not implied. Exposure to absolute
maximum ratings for extended periods may affect device
reliability.

OPERATING RANGES

Commercial (C) Devices
  Temperature (TA) ............................. 0 to +70°C
  Supply Voltage (VCC) ......................... +4.75 to +5.25 V
Military* (M) Devices
  Temperature (TA) ............................. -55 to +125°C
  Supply Voltage (VCC) ......................... +4.5 to +5.5 V

Operating ranges define those limits between which the
functionality of the device is guaranteed.

*Military Product 100% tested at TA = +25°C, +125°C, and
-55°C.

DC CHARACTERISTICS over operating range unless otherwise specified (for APL Products, Group A, 
Subgroups 1, 2, 3 are tested unless otherwise noted) 


Parameter 
Symbol 


Parameter 
Description 


Test Conditions (Note 1) 


Min. 


Max. 


Unit 


VOH 


Output HIGH Voltage 


VCC = Min.,
VIN = VIH or VIL,
IOH = -0.4 mA


2.4 




V 


Vol 


Output LOW Voltage 


VCC = Min.,
VIN = VIH or VIL,
IOL = 4 mA




0.5 


V 


V|H 


Input HIGH Level 


Guaranteed input logical HIGH voltage for all
inputs (Note 2)


2.0 




V 


V|L 


Input LOW Level 


Guaranteed input logical LOW
voltage for all inputs (Note 2)




0.8 


V 


l|L 


Input LOW Current 


vcc'*M%,\:-'-- 

- V|N - ^4 * 




-10 


mA 


l|H 


Input HIGH Current


VCC = Max., VIN = VCC - 0.5 V




10 


mA 


IOZH
IOZL


Off State (High Impedance)
Output Current


Vcc = Max. 


Vq = 2.4 V 




10 


fiA 


Vo = 0.5 V 




-10 


Ice 


Static Power Supply Current


Vcc = Max., 

V|N = Vcc or GND, 

Io = jiA 


COM'L 




25 


mA 


MIL 




25 


CPD 


Power Dissipation 
Capacitance 
(Note 3) 


Vcc = 5,0 V, 
Ta = 25°C, 
No Load 


3000 pF typical 


Notes: 1. VCC conditions shown as Min. or Max. refer to the military or commercial VCC limits.

2. These input levels provide zero noise immunity and should only be statically tested in a noise-free environment (not
functionally tested).

3. CPD determines the no-load dynamic current consumption:

ICC (Total) = ICC (Static) + CPD x VCC x f, where f is the switching frequency of the majority of the internal nodes,
normally one-half of the clock frequency. This specification is not tested.
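
As a worked illustration of Note 3, the sketch below evaluates ICC (Total) = ICC (Static) + CPD x VCC x f with f taken as half the clock frequency. The function name and the 10 MHz clock are assumptions chosen only for the example; the 25 mA static current and 3000 pF typical CPD come from the table above.

```python
def total_icc(icc_static_a, cpd_farads, vcc_volts, clock_hz):
    """No-load supply current in amperes, following the Note 3 formula."""
    f_switch = clock_hz / 2          # internal nodes switch at about half the clock rate
    return icc_static_a + cpd_farads * vcc_volts * f_switch


# 25 mA static, CPD = 3000 pF, VCC = 5.0 V, hypothetical 10 MHz clock
estimate = total_icc(0.025, 3000e-12, 5.0, 10e6)
```

Under these assumptions the dynamic term is 75 mA, so the estimate comes to roughly 100 mA.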

SWITCHING CHARACTERISTICS over COMMERCIAL operating range


No. 


Parameter 
Symbol 


Parameter 
Description 


Test 
Conditions 


29C323 


29C323-1 


29C323-2


Unit 


Min. 


Max.


Min. 


Max. 


Min. 


Max. 


UNCLOCKED MODE | 


1 


tMUC 


Unclocked Multiply Time 
X0-X31, Y0-Y31 to P0-P31


FTX/Y/P = HIGH




120 




100 




70 


ns 


2 


tMUCPP 


Unclocked Multiply Time 
X0-X31, Y0-Y31 to PP0-PP3 


FTX/Y/P = HIGH 




125 




105 




75 


ns 


3 


t|P 


Instruction to P0-P31 (Note 1) 


Output Taken From 
Adder FTI = HIGH 




120 




100 


'i- 


ns 


4 


t|PP 


Instruction to PP0-PP3 


Output Taken From 
Adder FTI = HIGH 




125 




105 


■■-i . 

:t-. 75 


ns 


CLOCKED MODE


5 


tMC 


Clocked Multiply Time 


FTX/Y/P = LOW 




100 




80 


i&^ 


-"■■55 


ns 


6 


tpDP 


Clock to P0-P31 


Output Taken from 
Temp or Product Reg. 




38 




30 


r 


■-■25 


ns 


7 


tpDPP 


Clock to PP0-PP3 


Output Taken from 
Temp or Product Reg. 


Vfi. 


43 




35 




30 


ns 


8 


tPAP 


Clock to P0-P31


Output Taken from 
Adder, FTX/Y/I = LOW 


J** 


135 


^1 


115 


--=■■45 


>.80 


ns 


9 


tpAPP 


Clock to PP0-PP3 


Output Taken from 
Adder, FTX/Y/I = LOW 


t'S: 


r,ftt0 


%2 


•a«o 


t% 


. -.86 


ns 


10 


tSP 


Data to Product Register Setup 
Time 


FTX/Y = HIGH 


^M, 


t 


90'.', 


-' fe'-- 


eSi 


1 


ns 


11 


tHP 


Data to Product Register Hold 
Time 


FTX/Y = HIGH 


tiSi 


hS^ 


t' 


^^'"'= 


ft 




ns 


12 


tSIPT 


Instruction to Product Register 
Setup Time 


FTI = HIGH 


ilj^°, 




^ 


."tltv 






ns 


13 


tHIPT 


Instruction to Product Register 
Hold Time 


FTI = HIGH 


m 




°\4 


'f--'j\ 




ns 






14 


tpWH 


Clock Pulse Width HIGH 




2t' 




20"" 


^"- 


15 




ns 


15 


tpWL 


Clock Pulse Width LOW 




%» 




29 V- 


-''i 


tl 




ns 


SETUP AND HOLD TIMES


16 


tSXY 


Register XA, XB, YA, YB Setup 
Time 












t5 ^ 


# 


ns 




•f^'^ 


17 


tHXY 


Register XA, XB, YA, YB Hold 
Time 






M; 


0I-. 


^"V-' 


o'j 




ns 


18 


tsi 


Instruction Register Setup Time 




18.,, 


h/^ 


1*.-. 


- v?= 


10 


fr- 


ns 


19 


tHI 


Instruction Register Hold Time 














0-. 




ns 


20 


tSEN 


Register Enable Setup Time 




18 




15 




to 


i:-y^ 


ns 


21 


tHEN 


Register Enable Hold Time 














!>-- 




ns 


22 


tSTS 


TSEL Setup Time 




18 




15 




10-^ 


-•-% 


ns 


23 


tHTS 


TSEL Hold Time 














Q.-.- 


SI 


ns 


COMMON PARAMETERS


24 


tpp 


PSEL0-PSEL1 to P0-P31 


To Active State Only 




35 




30 


"■- 


•'•ms 


ns 


25 


tppp 


PSEL0-PSEL1 to PP0-PP3 


To Active State Only 




35 




30 




25 


ns 


26 


•OEPI 


OE to P0-P31. PP0-PP3 
Output Enable 






35 




30 




25 


ns 


27 


tOD 


OE or PSEL0-PSEL1 to
P0-P31, PP0-PP3 Output
Disable






35 




30 




25 


ns 


28 


tDPE 


Data to PRERR 






35 




35 




30 


ns 


29 


tDHE 


Data to HDERR 


Slave = HIGH 




40 




40 




35 


ns 


Notes: 1. Instruction signals are XSEL, YSEL, TCX, TCY, ACC0, ACC1, and RND.

SWITCHING CHARACTERISTICS over MILITARY operating range (for APL Products, Group A, Subgroups
9, 10, 11 are tested unless otherwise noted)

                                                                                                       29C323
No.  Symbol   Parameter Description                                 Test Conditions                          Min.  Max.  Unit

UNCLOCKED MODE
1    tMUC     Unclocked Multiply Time, X0-X31, Y0-Y31 to P0-P31     FTX/Y/P = HIGH                           -     140   ns
2    tMUCPP   Unclocked Multiply Time, X0-X31, Y0-Y31 to PP0-PP3    FTX/Y/P = HIGH                           -     145   ns
3    tIP      Instruction to P0-P31 (Note 1)                        Output Taken from Adder, FTI = HIGH      -     140   ns
4    tIPP     Instruction to PP0-PP3                                Output Taken from Adder, FTI = HIGH      -     145   ns

CLOCKED MODE
5    tMC      Clocked Multiply Time                                 FTX/Y/P = LOW                            -     120   ns
6    tPDP     Clock to P0-P31                                       Output Taken from Temp or Product Reg.   -     45    ns
7    tPDPP    Clock to PP0-PP3                                      Output Taken from Temp or Product Reg.   -     50    ns
8    tPAP     Clock to P0-P31                                       Output Taken from Adder, FTX/Y/I = LOW   -     150   ns
9    tPAPP    Clock to PP0-PP3                                      Output Taken from Adder, FTX/Y/I = LOW   -     155   ns
10   tSP      Data to Product Register Setup Time                   FTX/Y = HIGH                             135   -     ns
11   tHP      Data to Product Register Hold Time                    FTX/Y = HIGH                             0     -     ns
12   tSIPT    Instruction to Product Reg Setup Time                 FTI = HIGH                               135   -     ns
13   tHIPT    Instruction to Product Reg Hold Time                  FTI = HIGH                               0     -     ns
14   tPWH     Clock Pulse Width HIGH                                                                         20    -     ns
15   tPWL     Clock Pulse Width LOW                                                                          20    -     ns

SETUP AND HOLD TIMES
16   tSXY     Register XA, XB, YA, YB Setup Time                                                             24    -     ns
17   tHXY     Register XA, XB, YA, YB Hold Time                                                              0     -     ns
18   tSI      Instruction Register Setup Time                                                                20    -     ns
19   tHI      Instruction Register Hold Time                                                                 0     -     ns
20   tSEN     Register Enable Setup Time                                                                     20    -     ns
21   tHEN     Register Enable Hold Time                                                                      0     -     ns
22   tSTS     TSEL Setup Time                                                                                20    -     ns
23   tHTS     TSEL Hold Time                                                                                 0     -     ns

COMMON PARAMETERS
24   tPP      PSEL0-PSEL1 to P0-P31                                 To Active State Only                     -     40    ns
25   tPPP     PSEL0-PSEL1 to PP0-PP3                                To Active State Only                     -     40    ns
26   tOEP1    OE to P0-P31, PP0-PP3 Output Enable                                                            -     40    ns
27   tOD      OE or PSEL0-PSEL1 to P0-P31, PP0-PP3 Output Disable                                            -     40    ns
28   tDPE     Data to PRERR                                                                                  -     40    ns
29   tDHE     Data to HDERR                                         Slave = HIGH                             -     45    ns

Notes: 1. Instruction signals are XSEL, YSEL, TCX, TCY, ACC0, ACC1, and RND.

SWITCHING TEST CIRCUITS

[Figure: switching test circuits. A. Three-State Outputs. B. Normal Outputs. Labeled elements include VOUT, a 5.1 K resistor, and load capacitance CL.]

Notes: 1. CL = 50 pF includes scope probe, wiring and stray capacitances without device in test fixture.

2. S1, S2, S3 are closed during function tests and all AC tests except output enable tests.

3. S1 and S3 are closed while S2 is open for tPZH test.
S1 and S2 are closed while S3 is open for tPZL test.

4. CL = TBD for output disable tests.



SWITCHING WAVEFORMS
KEY TO SWITCHING WAVEFORMS

[Figure: key to switching waveforms. Legend entries: will be changing from H to L; will be changing from L to H; don't care, any change permitted / changing, state unknown; does not apply / center line is high impedance "off" state.]

[Figure: Clocked Operation: FTX, Y, P, I = LOW. Signals shown: X0-X31; ENXA, ENXB, ENYA, ENYB, ENI; INST; ENP, ENT; TSEL; P0-P31; PP0-PP3.]

[Figure: Clocked Operation: Output Taken from Adder (FTX, Y, I = LOW; FTP = HIGH; PSEL1 ≠ PSEL0). Signals shown: CLK; X0-X31, Y0-Y31; ENXA, ENXB, ENYA, ENYB, ENI; INST; P0-P31; PP0-PP3.]

[Figure: Clocked Operation: Input Registers Bypassed (FTX, Y, I = HIGH; FTP = LOW). Signals shown: CLK; TSEL; P0-P31.]

[Figure: Unclocked Mode: FTX, Y, I, P = HIGH. Signals shown: X0-X31, Y0-Y31; INST; P0-P31; PP0-PP3.]

[Figure: Output Select Timing. Signals shown: PSEL0-PSEL1; OE; P0-P31; PP0-PP3.]

[Figure: PRERR Timing (parameter 28). Signals shown: X0-X31, PX0-PX3; Y0-Y31, PY0-PY3; PRERR.]

[Figure: Slave Mode Timing. Signals shown: P0-P31, PP0-PP3; HDERR.]



INPUT/OUTPUT CURRENT INTERFACE DIAGRAMS

[Circuit diagram IC000870: driven input and output interface circuits.]

CI ≈ 5.0 pF, all inputs
CO ≈ 5.0 pF, all outputs

Am29325

32-Bit Floating-Point Processor 



DISTINCTIVE CHARACTERISTICS

• Single VLSI device performs high-speed floating-point arithmetic
  - Floating-point addition, subtraction, and multiplication
    in a single clock cycle
  - Internal architecture supports sum-of-products,
    Newton-Raphson division
• 32-bit, three-bus flow-through architecture
  - Programmable I/O allows interface to 32- and 16-bit systems
• IEEE and DEC formats
  - Performs conversions between formats
  - Performs integer <-> floating-point conversions
• Six flags indicate operation status
• Register enables eliminate clock skew
• Input and output registers can be made transparent independently



GENERAL DESCRIPTION 



The Am29325 is a high-speed floating-point processor unit.
It performs 32-bit single-precision floating-point addition,
subtraction, and multiplication operations in a single VLSI
circuit, using the format specified by the proposed IEEE
floating-point standard, P754. The DEC single-precision
floating-point format is also supported. Operations for
conversion between 32-bit integer format and floating-point
format are available, as are operations for converting
between the IEEE and DEC floating-point formats. Any
operation can be performed in a single clock cycle. Six
flags (invalid operation, inexact result, zero, not-a-number,
overflow, and underflow) monitor the status of operations.

The Am29325 has a three-bus, 32-bit architecture, with two
input buses and one output bus. This configuration provides
high I/O bandwidth, allows access to all buses, and affords
a high degree of flexibility when connecting this device in a
system. All buses are registered, with each register having a
clock enable. Input and output registers may be made
transparent independently. Two other I/O configurations, a
32-bit, two-bus architecture and a 16-bit, three-bus
architecture, are user-selectable, easing interface with a wide
variety of systems. Thirty-two-bit internal feedforward data
paths support accumulation operations, including
sum-of-products and Newton-Raphson division.

Fabricated with the high-speed IMOX™ bipolar process,
the Am29325 is powered by a single 5-volt supply. The
device is housed in a 145-terminal pin-grid-array package.



Am29300 FAMILY HIGH-PERFORMANCE SYSTEM BLOCK DIAGRAM

[Block diagram. Elements shown: Am29331 16-Bit Sequencer; Microprogram Memory; Pipeline Register; Control Signals; Am29334 Register File, 64x18; Am29332 32-Bit ALU; Am29323 32x32 Parallel Multiplier; 32-bit data buses.]

Publication # [illegible in source]   Rev. D   Amendment /0
Issue Date: April 1987



RELATED AMD PRODUCTS

Part No.     Description
Am29114      Vectored Priority Interrupt Controller
Am29116      High-Performance Bipolar 16-Bit Microprocessor
Am29C116     High-Performance CMOS 16-Bit Microprocessor
Am29PL141    Fuse Programmable Controller
Am29C323     CMOS 32-Bit Parallel Multiplier
Am29331      16-Bit Microprogram Sequencer
Am29C331     CMOS 16-Bit Microprogram Sequencer
Am29332      32-Bit Extended Function ALU
Am29C332     CMOS 32-Bit Extended Function ALU
Am29334      64x18 Four-Port, Dual-Access Register File
Am29C334     CMOS 64x18 Four-Port, Dual-Access Register File
Am29337      16-Bit Bounds Checker
Am29338      Byte Queue



BLOCK DIAGRAM

[Block diagram BD007080. Elements shown: R0 - R31 and S0 - S31 inputs; CLK; select and enable lines (16); two 2:1 multiplexers; Register R; Register S; Floating-Point ALU; Register F; OE; Status Flag Generator; Status Flag Register.]

CONNECTION DIAGRAM
Top View

PGA

[Pin-grid-array diagram CD010490: rows A - R, columns 1 - 15. Pin assignments are listed under Pin Designations.]

Key:
  16/32 = S16/32
  GNDE  = Ground, ECL
  GNDT  = Ground, TTL
  I/D   = IEEE/DEC
  INEX  = INEXACT
  INVA  = INVALID
  OBUS  = ONEBUS
  OVFL  = OVERFLOW
  P/AFF = PROJ/AFF
  UNFL  = UNDERFLOW
  VCCE  = VCC, ECL
  VCCT  = VCC, TTL

*D4 is an alignment pin (not connected internally).

PIN DESIGNATIONS

(Sorted by Pin No.)
PIN NO.  PIN NAME     PIN NO.  PIN NAME     PIN NO.  PIN NAME     PIN NO.  PIN NAME
A-1      INEXACT      C-7      F25          H-13     GNDE         N-10     S28
A-2      INVALID      C-8      VCCT         H-14     GNDE         N-11     S27
A-3      F29          C-9      VCCT         H-15     S5           N-12     VCCE
A-4      F30          C-10     F16          J-1      CLK          N-13     VCCE
A-5      F23          C-11     F11          J-2      RND0         N-14     S18
A-6      F26          C-12     F10          J-3      VCCE         N-15     S17
A-7      F21          C-13     GNDT         J-13     GNDE         P-1      R21
A-8      F22          C-14     F2           J-14     S4           P-2      R22
A-9      F17          C-15     F1           J-15     S7           P-3      R19
A-10     F18          D-1      ENF          K-1      R31          P-4      R16
A-11     F13          D-2      IEEE/DEC     K-2      RND1         P-5      R11
A-12     F12          D-3      ENR          K-3      R29          P-6      R10
A-13     F7           D-13     GNDT         K-13     S8           P-7      R5
A-14     F8           D-14     GNDT         K-14     S9           P-8      R4
A-15     F5           D-15     GNDT         K-15     S6           P-9      I3
B-1      I2           E-1      I4           L-1      R30          P-10     S31
B-2      NAN          E-2      FT0          L-2      R27          P-11     S26
B-3      ZERO         E-3      ENS          L-3      R26          P-12     S25
B-4      F31          E-13     GNDT         L-13     S13          P-13     S22
B-5      OVERFLOW     E-14     F0           L-14     S10          P-14     S21
B-6      F27          E-15     PROJ/AFF     L-15     S11          P-15     S16
B-7      F24          F-1      ONEBUS       M-1      R25          R-1      R20
B-8      F19          F-2      FT1          M-2      R28          R-2      R17
B-9      F20          F-3      S16/32       M-3      GNDE         R-3      R18
B-10     F15          F-13     GNDT         M-13     S14          R-4      R13
B-11     F14          F-14     S1           M-14     S15          R-5      R12
B-12     F9           F-15     S0           M-15     S12          R-6      R7
B-13     F6           G-1      OE           N-1      R24          R-7      R6
B-14     F3           G-2      VCCE*        N-2      R23          R-8      R1
B-15     F4           G-3      VCCE         N-3      GNDE         R-9      R2
C-1      I1           G-13     GNDE         N-4      R15          R-10     S30
C-2      I0           G-14     S2           N-5      R14          R-11     S29
C-3      GNDT*        G-15     S3           N-6      R9           R-12     S24
C-4      GNDT         H-1      VCCE         N-7      R8           R-13     S23
C-5      UNDERFLOW    H-2      VCCE         N-8      R3           R-14     S20
C-6      F28          H-3      VCCE         N-9      R0           R-15     S19

*T and E represent TTL and ECL, respectively.

PIN DESIGNATIONS (Cont'd.)
(Sorted by Pin Name)


PIN NO.  PIN NAME     PIN NO.  PIN NAME     PIN NO.  PIN NAME     PIN NO.  PIN NAME
J-1      CLK          E-2      FT0          R-6      R7           K-14     S9
D-1      ENF          F-2      FT1          N-7      R8           L-14     S10
D-3      ENR          N-3      GNDE*        N-6      R9           L-15     S11
E-3      ENS          H-14     GNDE         P-6      R10          M-15     S12
E-14     F0           G-13     GNDE         P-5      R11          L-13     S13
C-15     F1           M-3      GNDE         R-5      R12          M-13     S14
C-14     F2           H-13     GNDE         R-4      R13          M-14     S15
B-14     F3           J-13     GNDE         N-5      R14          P-15     S16
B-15     F4           D-15     GNDT         N-4      R15          F-3      S16/32
A-15     F5           D-14     GNDT         P-4      R16          N-15     S17
B-13     F6           E-13     GNDT         R-2      R17          N-14     S18
A-13     F7           F-13     GNDT         R-3      R18          R-15     S19
A-14     F8           C-4      GNDT         P-3      R19          R-14     S20
B-12     F9           C-3      GNDT         R-1      R20          P-14     S21
C-12     F10          D-13     GNDT         P-1      R21          P-13     S22
C-11     F11          C-13     GNDT         P-2      R22          R-13     S23
A-12     F12          C-2      I0           N-2      R23          R-12     S24
A-11     F13          C-1      I1           N-1      R24          P-12     S25
B-11     F14          B-1      I2           M-1      R25          P-11     S26
B-10     F15          P-9      I3           L-3      R26          N-11     S27
C-10     F16          E-1      I4           L-2      R27          N-10     S28
A-9      F17          D-2      IEEE/DEC     M-2      R28          R-11     S29
A-10     F18          A-1      INEXACT      K-3      R29          R-10     S30
B-8      F19          A-2      INVALID      L-1      R30          P-10     S31
B-9      F20          B-2      NAN          K-1      R31          C-5      UNDERFLOW
A-7      F21          G-1      OE           J-2      RND0         J-3      VCCE
A-8      F22          F-1      ONEBUS       K-2      RND1         G-2      VCCE
A-5      F23          B-5      OVERFLOW     F-15     S0           G-3      VCCE
B-7      F24          E-15     PROJ/AFF     F-14     S1           H-2      VCCE
C-7      F25          N-9      R0           G-14     S2           N-13     VCCE
A-6      F26          R-8      R1           G-15     S3           N-12     VCCE
B-6      F27          R-9      R2           J-14     S4           H-3      VCCE
C-6      F28          N-8      R3           H-15     S5           H-1      VCCE
A-3      F29          P-8      R4           K-15     S6           C-8      VCCT
A-4      F30          P-7      R5           J-15     S7           C-9      VCCT
B-4      F31          R-7      R6           K-13     S8           B-3      ZERO

*E and T represent ECL and TTL, respectively.

LOGIC SYMBOL






[Logic symbol. Inputs: R0 - R31, S0 - S31, CLK, ENR, ENS, ENF, FT0, FT1, I0 - I4, IEEE/DEC, OE, ONEBUS, PROJ/AFF, RND0, RND1, S16/32. Outputs: F0 - F31, INEXACT, INVALID, NAN, OVERFLOW, UNDERFLOW, ZERO.]

METALLIZATION AND PAD LAYOUT

ORDERING INFORMATION
Standard Products

AMD standard products are available in several packages and operating ranges. The order number (Valid Combination) is
formed by a combination of:
   a. Device Number
   b. Speed Option (if applicable)
   c. Package Type
   d. Temperature Range
   e. Optional Processing

AM29325  G  C  B

   a. DEVICE NUMBER/DESCRIPTION
      Am29325
      32-Bit Floating-Point Processor

   b. SPEED OPTION
      Not Applicable

   c. PACKAGE TYPE
      G = 145-Terminal Pin Grid Array (CG 145)

   d. TEMPERATURE RANGE
      C = Commercial (0 to +85°C) Case

   e. OPTIONAL PROCESSING
      Blank = Standard processing
      B = Burn-in

Valid Combinations

   Am29325     GC, GCB

Valid Combinations

Valid Combinations list configurations planned to be
supported in volume for this device. Consult the local AMD
sales office to confirm availability of specific valid
combinations, to check on newly released combinations, and
to obtain additional data on AMD's standard military grade
products.

PIN DESCRIPTION



R0 - R31   R Operand Bus (Input)

R0 is the least-significant bit.

S0 - S31   S Operand Bus (Input)

S0 is the least-significant bit.

F0 - F31   F Operand Bus (Output)

F0 is the least-significant bit.

CLK   Clock (Input)

For the internal registers.

ENR   Register R Clock Enable (Input; Active LOW)

When ENR is LOW, register R is clocked on the LOW-to-HIGH
transition of CLK. When ENR is HIGH, register R
retains the previous contents.

ENS   Register S Clock Enable (Input; Active LOW)

When ENS is LOW, register S is clocked on the LOW-to-HIGH
transition of CLK. When ENS is HIGH, register S
retains the previous contents.

ENF   Register F Clock Enable (Input; Active LOW)

When ENF is LOW, register F is clocked on the LOW-to-HIGH
transition of CLK. When ENF is HIGH, register F
retains the previous contents.

OE   Output Enable (Input; Active LOW)

When OE is LOW, the contents of register F are placed on
F0 - F31. When OE is HIGH, F0 - F31 assume a
high-impedance state.

ONEBUS   Input Bus Configuration Control (Input)

A LOW on ONEBUS configures the input bus circuitry for
two-input bus operation. A HIGH on ONEBUS configures
the input bus circuitry for single-input bus operation.

FT0   Input Register Feedthrough Control (Input;
Active HIGH)

When FT0 is HIGH, registers R and S are transparent.

FT1   Output Register Feedthrough Control (Input;
Active HIGH)

When FT1 is HIGH, register F and the status flag register
are transparent.

I0 - I2   Operation Select Lines (Input)

Used to select the operation to be performed by the ALU.
See Table 1 for a list of operations and the corresponding
codes.

I3   ALU S Port Input Select (Input)

A LOW on I3 selects register S as the input to the ALU S
port. A HIGH on I3 selects register F as the input to the ALU
S port.

I4   Register R Input Select (Input)

A LOW on I4 selects R0 - R31 as the input to register R. A
HIGH selects the ALU F port as the input to register R.

IEEE/DEC   IEEE/DEC Mode Select (Input)

When IEEE/DEC is HIGH, IEEE mode is selected. When
IEEE/DEC is LOW, DEC mode is selected.

S16/32   16- or 32-Bit I/O Mode Select (Input)

A LOW on S16/32 selects the 32-bit I/O mode; a HIGH
selects the 16-bit I/O mode. In 32-bit mode, input and
output buses are 32 bits wide. In 16-bit mode, input and
output buses are 16 bits wide, with the least- and most-significant
portions of the 32-bit input and output words
being placed on the buses during the HIGH and LOW
portions of CLK, respectively.

RND0, RND1   Rounding Mode Selects (Input)

RND0 and RND1 select one of four rounding modes. See
Table 5 for a list of rounding modes and the corresponding
control codes.

PROJ/AFF   Projective/Affine Mode Select (Input)

Choice of projective or affine mode determines the way in
which infinities are handled in IEEE mode. A LOW on
PROJ/AFF selects affine mode; a HIGH selects projective
mode.

OVERFLOW   Overflow Flag (Output; Active HIGH)

A HIGH indicates that the last operation produced a final
result that overflowed the floating-point format.

UNDERFLOW   Underflow Flag (Output; Active HIGH)

A HIGH indicates that the last operation produced a
rounded result that underflowed the floating-point format.

ZERO   Zero Flag (Output; Active HIGH)

A HIGH indicates that the last operation produced a final
result of zero.

NAN   Not-a-Number Flag (Output; Active HIGH)

A HIGH indicates that the final result produced by the last
operation is not to be interpreted as a number. The output in
such cases is either an IEEE Not-a-Number (NAN) or a
DEC-reserved operand.

INVALID   Invalid Operation Flag (Output; Active
HIGH)

A HIGH indicates that the last operation performed was
invalid; e.g., ∞ times 0.

INEXACT   Inexact Result Flag (Output; Active HIGH)

A HIGH indicates that the final result of the last operation
was not infinitely precise, due to rounding.



Definition of Terms

Affine Mode

One of two modes affecting the handling of operations on
infinities; see the Operations with Infinities section under
Operation in IEEE Mode.

Biased Exponent

The true exponent of a floating-point number, plus a constant.
For IEEE floating-point numbers, the constant is 127; for DEC
floating-point numbers, the constant is 128. See also True
Exponent.

Bus

Data input or output channel for the floating-point processor.

DEC-Reserved Operand

A DEC floating-point number that is interpreted as a symbol
and has no numeric value. A DEC-reserved operand has a
sign of 1 and a biased exponent of 0.

Destination Format

The format of the final result produced by the floating-point
ALU. The destination format can be IEEE floating point, DEC
floating point, or integer.

Final Result

The result produced by the floating-point ALU.

Fraction

The 23 least-significant bits of the mantissa.

Infinitely Precise Result

The result that would be obtained from an operation if both
exponent range and precision were unbounded.

Input Operands

The value or values on which an operation is performed. For
example, the addition 2 + 3 = 5 has input operands 2 and 3.

Mantissa

The portion of a floating-point number containing the number's
significant bits. For the floating-point number 1.101 x 2^-3, the
mantissa is 1.101.

NAN (Not-a-Number)

An IEEE floating-point number that is interpreted as a symbol,
and has no numeric value. A NAN has a biased exponent of
255 (decimal) and a non-zero fraction.

Port

Data input or output channel for the floating-point ALU.

Projective Mode

One of two modes affecting the handling of operations on
infinities; see the Operations with Infinities section under
Operation in IEEE Mode.

Rounded Result

The result produced by rounding the infinitely precise result to
fit the destination format.

True Exponent (or Exponent)

Number representing the power of two by which a floating-point
number's mantissa is to be multiplied. For the floating-point
number 1.101 x 2^-3, the true exponent is -3.

FUNCTIONAL DESCRIPTION

Architecture

The Am29325 comprises a high-speed, floating-point ALU, a
status flag generator, and a 32-bit data path.

Floating-Point ALU

The floating-point ALU performs 32-bit floating-point
operations. It also performs floating-point-to-integer conversions,
integer-to-floating-point conversions, and conversions
between the IEEE and DEC formats. The ALU has
two 32-bit input ports, R and S, and a 32-bit output port, F.

Conceptually, the process performed by the ALU can be
divided into three stages (see Figure 1). The operation stage
performs the arithmetic operation selected by the user; the
output of this stage is referred to as the infinitely precise
result of the operation. The rounding stage rounds the
infinitely precise result to fit in the destination format; the
output of this stage is called the rounded result. The last stage
checks for exceptional conditions. If no exceptional condition
is found, the rounded result is passed through this stage. If
some exceptional condition is found (e.g., overflow, underflow,
or an invalid operation), this stage may replace the rounded
result with another output, such as +∞, -∞, a NAN, or a
DEC-reserved operand. The output of this last stage appears on
port F, and is called the final result.



   OPERATION STAGE
   (PERFORMS SELECTED OPERATION)
        |
        |  INFINITELY PRECISE RESULT
        v
   ROUNDING STAGE
   (ROUNDS INFINITELY PRECISE RESULT)
        |
        |  ROUNDED RESULT
        v
   EXCEPTION STAGE
   (CHECKS FOR UNUSUAL CONDITIONS)
        |
        |  FINAL RESULT
        v

Figure 1. Conceptual Model of the Process
Performed by the Floating-Point ALU

The ALU performs one of eight operations; the operation to be
performed is selected by placing the appropriate control code
on lines I0 - I2. Table 1 gives the control codes corresponding
to each of the eight operations.

The floating-point addition operation (R PLUS S) adds the
floating-point numbers on ports R and S, and places the
floating-point result on port F. In IEEE mode
(IEEE/DEC = HIGH) the addition is performed in IEEE floating-point
format; in DEC mode (IEEE/DEC = LOW) the addition is
performed in DEC format.

The floating-point subtraction operation (R MINUS S) subtracts
the floating-point number on port S from the floating-point
number on port R and places the floating-point result on
port F. In IEEE mode (IEEE/DEC = HIGH) the subtraction is
performed in IEEE floating-point format; in DEC mode
(IEEE/DEC = LOW) the subtraction is performed in DEC
format.

The floating-point multiplication operation (R TIMES S) multiplies
the floating-point numbers on ports R and S, and places
the floating-point result on port F. In IEEE mode
(IEEE/DEC = HIGH) the multiplication is performed in IEEE
floating-point format; in DEC mode (IEEE/DEC = LOW) the
multiplication is performed in DEC format.

The floating-point constant subtraction (2 MINUS S) operation
subtracts the floating-point value on port S from 2, and places
the result on port F. The operand on port R is not used in this
operation; its value will not affect the operation in any way. In
IEEE mode (IEEE/DEC = HIGH) the operation is performed in
IEEE floating-point format; in DEC mode (IEEE/DEC = LOW)
the operation is performed in DEC format. This operation is
used to support Newton-Raphson floating-point division; a
description of its use appears in Appendix C.
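
How 2 MINUS S supports division can be sketched in software. The sketch below is not taken from Appendix C: it assumes the standard Newton-Raphson reciprocal iteration x(n+1) = x(n) * (2 - b * x(n)), and the names to_f32, fp_mul, fp_two_minus, reciprocal, and divide, as well as the seed value and iteration count, are hypothetical. Python floats are double precision, so each result is rounded to single precision to imitate a 32-bit data path; rounding-mode details of the device are not modeled.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def fp_mul(r: float, s: float) -> float:
    """Stand-in for the R TIMES S operation (F = R * S)."""
    return to_f32(r * s)


def fp_two_minus(s: float) -> float:
    """Stand-in for the 2 MINUS S operation (F = 2 - S)."""
    return to_f32(2.0 - s)


def reciprocal(b: float, seed: float, iterations: int = 4) -> float:
    """Approximate 1/b using only multiplications and 2 MINUS S steps.

    The seed must already be a rough estimate of 1/b; how a seed is
    obtained in a real system is outside the scope of this sketch.
    """
    x = to_f32(seed)
    for _ in range(iterations):
        # x <- x * (2 - b * x): the relative error is squared each pass.
        x = fp_mul(x, fp_two_minus(fp_mul(b, x)))
    return x


def divide(a: float, b: float, seed: float) -> float:
    """Compute a / b as a * (1 / b)."""
    return fp_mul(a, reciprocal(b, seed))
```

Each iteration uses two multiplications and one constant subtraction, which is why a dedicated 2 MINUS S operation is convenient: no constant has to be supplied on port R.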

The integer-to-floating-point conversion (INT-TO-FP) operation
takes a 32-bit, two's-complement integer on port R and
places the equivalent floating-point value on port F. The
operand on port S is not used in this operation; its value will
not affect the operation in any way. In IEEE mode
(IEEE/DEC = HIGH) the result is delivered in IEEE format; in DEC
mode (IEEE/DEC = LOW) the result is delivered in DEC
format.



TABLE 1. ALU OPERATION SELECT

I2  I1  I0   Operation                                          Output Equation
0   0   0    Floating-point addition (R PLUS S)                 F = R + S
0   0   1    Floating-point subtraction (R MINUS S)             F = R - S
0   1   0    Floating-point multiplication (R TIMES S)          F = R * S
0   1   1    Floating-point constant subtraction (2 MINUS S)    F = 2 - S
1   0   0    Integer-to-floating-point conversion (INT-TO-FP)   F (floating-point) = R (integer)
1   0   1    Floating-point-to-integer conversion (FP-TO-INT)   F (integer) = R (floating-point)
1   1   0    IEEE-to-DEC format conversion (IEEE-TO-DEC)        F (DEC format) = R (IEEE format)
1   1   1    DEC-to-IEEE format conversion (DEC-TO-IEEE)        F (IEEE format) = R (DEC format)
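
The operation select codes can be captured in a small lookup, for example in a microcode assembler or a test bench. The Python sketch below assumes the codes of Table 1 run in binary order with I2 as the most-significant select line; the enum and function names are hypothetical and are not part of any AMD tool.

```python
from enum import IntEnum
from typing import Tuple


class AluOp(IntEnum):
    """Operation select codes, written as the 3-bit value I2 I1 I0."""
    R_PLUS_S = 0b000
    R_MINUS_S = 0b001
    R_TIMES_S = 0b010
    TWO_MINUS_S = 0b011
    INT_TO_FP = 0b100
    FP_TO_INT = 0b101
    IEEE_TO_DEC = 0b110
    DEC_TO_IEEE = 0b111


def select_lines(op: AluOp) -> Tuple[int, int, int]:
    """Return the levels (I2, I1, I0) for an operation; 1 = HIGH, 0 = LOW."""
    return (op >> 2) & 1, (op >> 1) & 1, op & 1


def decode(i2: int, i1: int, i0: int) -> AluOp:
    """Return the operation selected by the given line levels."""
    return AluOp((i2 << 2) | (i1 << 1) | i0)
```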



The floating-point-to-integer conversion (FP-TO-INT) operation
takes a floating-point number on port R and places the
equivalent 32-bit, two's-complement integer value on port F.
The operand on port S is not used in this operation; its value
will not affect the operation in any way. In IEEE mode
(IEEE/DEC = HIGH) the operand on port R is interpreted using the
IEEE floating-point format; in DEC mode (IEEE/DEC = LOW)
it is interpreted using the DEC floating-point format.

The IEEE-to-DEC conversion operation (IEEE-TO-DEC) takes
an IEEE-format floating-point number on port R and places the
equivalent DEC-format floating-point number on port F. The
operand on port S is not used in this operation; its value will
not affect the operation in any way. The operation can be
performed in either IEEE mode (IEEE/DEC = HIGH) or DEC
mode (IEEE/DEC = LOW).

The DEC-to-IEEE conversion operation (DEC-TO-IEEE) takes
a DEC-format floating-point number on port R and places the
equivalent IEEE-format floating-point number on port F. The operand
on port S is not used in this operation; its value will not affect
the operation in any way. The operation can be performed in
either IEEE mode (IEEE/DEC = HIGH) or DEC mode
(IEEE/DEC = LOW).

Status Flag Generator

The status flag generator controls the state of six flags that
report the status of floating-point ALU operations. The flags
indicate when an operation is invalid (e.g., ∞ times 0) or when
an operation has produced an overflow, an underflow, a
non-numerical result (e.g., a NAN or DEC-reserved operand), an
inexact result, or a result of zero. The flags represent the
status of the most recently performed operation. Flag status is
stored in the flag status register on the LOW-to-HIGH transition
of CLK. When the output register feedthrough control FT1
is HIGH, the flag status register is made transparent.



Data Path

The 32-bit data path consists of the R and S input buses; the F
output bus; data registers R, S, and F; the register R input
multiplexer; and the ALU port S input multiplexer.

Input operands enter the floating-point processor through the
32-bit R and S input buses, R0 - R31 and S0 - S31. Results of
operations appear on the 32-bit F bus, F0 - F31. The F bus
assumes a high-impedance state when output enable OE is
HIGH.

The R and S registers store input operands; the F register
stores the final result of the floating-point ALU operation. Each
register has an independent clock enable (ENR, ENS, and
ENF). When a register's clock enable is LOW, the register
stores the data on its input at the LOW-to-HIGH transition of
CLK; when the clock enable is HIGH, the register retains its
current data. All data registers are fully edge-triggered: both
the input data and the register enable need only meet modest
setup and hold time requirements. Registers R and S can be
made transparent by setting FT0, the input register feedthrough
control, HIGH. Register F can be made transparent by
setting FT1, the output register feedthrough control, HIGH.

The register R input multiplexer selects either the R input bus
or the floating-point ALU's F port as the input to register R.
Selection is controlled by I4: a LOW selects the R input bus;
a HIGH selects the ALU F port. The ALU port S input
multiplexer selects either register S or register F as the input to
the floating-point ALU's S port. Selection is controlled by I3:
a LOW selects register S; a HIGH selects register F.
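
The register and multiplexer behavior described above can be illustrated with a small behavioral model. The Python sketch below is an idealization under stated assumptions: it models only the clocked (non-transparent) case, ignores setup and hold timing, uses Python floats instead of 32-bit words, and all class, method, and parameter names are hypothetical. It shows how holding I3 HIGH feeds register F back to ALU port S so that a running sum accumulates in register F.

```python
from typing import Callable, Iterable


class DataPath:
    """Idealized model of registers R, S, F and the two input multiplexers."""

    def __init__(self) -> None:
        self.r = 0.0
        self.s = 0.0
        self.f = 0.0

    def clock(self, op: Callable[[float, float], float],
              r_bus: float = 0.0, s_bus: float = 0.0,
              i3: int = 0, i4: int = 0,
              enr: int = 0, ens: int = 0, enf: int = 0) -> None:
        """Model one LOW-to-HIGH transition of CLK."""
        s_port = self.f if i3 else self.s   # I3 HIGH: register F drives ALU port S
        alu_f = op(self.r, s_port)          # combinational result on ALU port F
        r_in = alu_f if i4 else r_bus       # I4 HIGH: ALU port F drives register R
        # Clock enables are active LOW; every register samples pre-edge values.
        next_r = r_in if enr == 0 else self.r
        next_s = s_bus if ens == 0 else self.s
        next_f = alu_f if enf == 0 else self.f
        self.r, self.s, self.f = next_r, next_s, next_f


def accumulate(values: Iterable[float]) -> float:
    """Sum values presented on the R bus, using the F-to-S feedback path."""
    dp = DataPath()

    def add(r: float, s: float) -> float:
        return r + s

    for v in values:
        dp.clock(add, r_bus=v, i3=1)
    dp.clock(add, r_bus=0.0, i3=1)  # one extra cycle adds in the last operand
    return dp.f
```

Because register R and register F are both clocked on the same edge, an operand loaded into R on one cycle is added to the accumulated value on the next; that is why the sketch needs one extra clock after the last operand.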

Data selected by I3 and I4 is described in Table 2. When
registers R and S are transparent (FT0 = HIGH), multiplexer
select I4 must be kept LOW, so that the register R input
multiplexer selects R0 - R31. When register F is transparent
(FT1 = HIGH), multiplexer select I3 must be kept LOW, so that
the ALU port S input multiplexer selects register S.

TABLE 2. MUX SELECT

I3   Data selected for floating-point ALU S port
0    Register S
1    Register F

I4   Data selected for register R input
0    R bus
1    Floating-point ALU port F

I/O Modes

The Am29325 data path can be configured in one of three I/O
modes: a 32-bit, two-input bus mode; a 32-bit, single-input bus
mode; and a 16-bit, two-input bus mode. These modes affect
only the manner in which data is delivered to and taken from
the Am29325; operation of the floating-point ALU is not
altered. The I/O mode is selected with the ONEBUS and
S16/32 controls. Table 3 lists the control codes needed to invoke
each I/O mode.

TABLE 3. I/O MODE SELECTION

S16/32   ONEBUS   I/O Mode
0        0        32-bit, two-input-bus mode
0        1        32-bit, single-input-bus mode (*)
1        0        16-bit, two-input-bus mode (*)
1        1        Illegal I/O mode selection value

*FT0 must be held LOW in this mode (see text).
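
A configuration check based on the I/O mode selection can be sketched as follows. This Python fragment is illustrative only; the function names are hypothetical, levels are written 1 for HIGH and 0 for LOW, and the mapping assumes that a LOW on S16/32 selects 32-bit I/O and a LOW on ONEBUS selects two-input bus operation, as stated in the Pin Description.

```python
def io_mode(s16_32: int, onebus: int) -> str:
    """Return the I/O mode selected by the S16/32 and ONEBUS levels."""
    modes = {
        (0, 0): "32-bit, two-input bus",
        (0, 1): "32-bit, single-input bus",
        (1, 0): "16-bit, two-input bus",
    }
    try:
        return modes[(s16_32, onebus)]
    except KeyError:
        raise ValueError("illegal I/O mode selection value") from None


def ft0_allowed(s16_32: int, onebus: int, ft0: int) -> bool:
    """Check the footnote rule: FT0 must be LOW in the two multiplexed modes."""
    io_mode(s16_32, onebus)  # raises ValueError for the illegal combination
    multiplexed = (s16_32, onebus) != (0, 0)
    return not (multiplexed and ft0 == 1)
```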

32-Bit, Two-Input Bus Mode

In this I/O mode, the R and S buses are configured as
independent 32-bit input buses, and the F bus is configured as
a 32-bit output bus. Figure 2 is a functional block diagram of
the Am29325 in this I/O mode.

R and S operands are taken from their respective input buses
and clocked into the R and S registers on the LOW-to-HIGH
transition of CLK. Register F is also clocked on the LOW-to-HIGH
transition of CLK. Figure 5(a) depicts typical I/O timing
in this mode.



[Block diagram BD007050. Elements shown: R bus (R0 - R31) and S bus (S0 - S31), 32 bits each; ENR; CLK; ONEBUS (= LOW); S16/32 (= LOW); registers R and S; floating-point ALU; register F; F bus (F0 - F31).]

Figure 2. Functional Block Diagram for the 32-Bit, Two-Input Bus Mode

32-Bit, Single-Input Bus Mode



In this I/O mode, the R and S buses are connected to a single
32-bit multiplexed input data bus; the F bus is configured as an
independent 32-bit output bus. Figure 3 is a functional block
diagram of the Am29325 in this I/O mode. Note that both the
R and S bus lines must be wired to the input bus.

R and S operands are multiplexed onto the input bus by the
host system. The S operand is clocked from the input bus into
a temporary holding register on the HIGH-to-LOW transition of
CLK and is transferred to register S on the LOW-to-HIGH
transition of CLK. The R operand is clocked from the input bus
into register R on the LOW-to-HIGH transition of CLK. Register
F is clocked on the LOW-to-HIGH transition of CLK. Figure
5(b) depicts typical I/O timing in this mode.

When placed in this I/O mode, the data path will not function
properly if the R and S registers are made transparent.
Therefore, input register feedthrough control FT0 must be held
LOW in this mode.

Figure 3. Functional Block Diagram for the 32-Bit, Single-Input Bus Mode

16-Bit, Two-Input Bus Mode

In this I/O mode, the R and S buses are configured as
independent 16-bit input buses, and the F bus is configured as
a 16-bit output bus. Figure 4 is a functional block diagram of
the Am29325 in this I/O mode. Note that the 16 least-significant
bits (LSBs) and 16 most-significant bits (MSBs) of
the R, S, and F buses must be wired to their respective system
buses in parallel.

Thirty-two-bit operands are passed along the 16-bit data
buses by time-multiplexing the 16 LSBs and 16 MSBs of each
32-bit word. For the R input bus, the host system multiplexes
the 16 LSBs and 16 MSBs of the R operand onto the 16-bit R
bus. The 16 LSBs of the R operand are stored in a temporary
holding register on the HIGH-to-LOW transition of CLK. The 16
MSBs are clocked into register R on the LOW-to-HIGH
transition of CLK; at the same time, the 16 LSBs are
transferred from the temporary holding register to register R.
Transfer of data from the S input bus to the S register takes
place in a similar fashion. Register F is clocked on the
LOW-to-HIGH transition of CLK. Circuitry internal to the Am29325
multiplexes data from register F onto the 16-bit output bus by
enabling the 16 LSBs of the F output bus when CLK is HIGH,
and enabling the 16 MSBs of the F output bus when CLK is
LOW. Figure 5(c) depicts typical I/O timing in this mode.

When placed in this I/O mode, the data path will not function
properly if the R and S registers are made transparent.
Therefore, input register feedthrough control FT0 must be held
LOW in this mode. Caution must also be taken in controlling
the register R input multiplexer control line, I4, in this I/O
mode. I4 should be changed only when CLK is HIGH, in
addition to meeting the setup and hold time requirements
given in the Switching Characteristics section.
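
The time-multiplexing of 32-bit words over the 16-bit buses can be sketched from the host system's point of view. The Python fragment below is a behavioral illustration only, with hypothetical function names; it ignores setup and hold timing and models words as unsigned integers.

```python
from typing import Tuple


def split_word(word: int) -> Tuple[int, int]:
    """Split a 32-bit word into (16 LSBs, 16 MSBs)."""
    return word & 0xFFFF, (word >> 16) & 0xFFFF


def load_register(lsbs_at_falling_edge: int, msbs_at_rising_edge: int) -> int:
    """Reassemble an input operand.

    The 16 LSBs are captured in the temporary holding register on the
    HIGH-to-LOW transition of CLK; the 16 MSBs arrive on the following
    LOW-to-HIGH transition, when both halves enter the register.
    """
    return ((msbs_at_rising_edge & 0xFFFF) << 16) | (lsbs_at_falling_edge & 0xFFFF)


def f_bus_16(f_register: int, clk_high: bool) -> int:
    """Value on the 16-bit F bus: LSBs while CLK is HIGH, MSBs while CLK is LOW."""
    lsbs, msbs = split_word(f_register)
    return lsbs if clk_high else msbs
```

For example, the IEEE single-precision encoding of 1.0, 3F800000 (hex), travels as the halves 0000 (hex) and 3F80 (hex).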

Operation in IEEE Mode

When input signal IEEE/DEC is HIGH, the IEEE mode of
operation is selected. In this mode the Am29325 uses the
floating-point format set forth in the IEEE Proposed Standard
for Binary Floating-Point Arithmetic, P754. In addition, the
IEEE mode complies with most other aspects of single-precision
floating-point operation outlined in the proposed
standard; differences are discussed in Appendix A.

IEEE Floating-Point Format 

The IEEE single-precision floating-point word is 32 bits wide,
and is arranged in the format shown in Figure 6. The floating-point
word is divided into three fields: a single-bit sign, an 8-bit
biased exponent, and a 23-bit fraction.

The sign bit indicates the sign of the floating-point number's
value. Non-negative values have a sign of 0; negative values,
a sign of 1. The value zero may have either sign.

The biased exponent is an 8-bit unsigned integer field representing
a multiplicative factor of some power of two. The bias
value is 127. If, for example, the multiplicative factor for a
floating-point number is to be 2^a, the value of the biased
exponent would be a + 127; "a" is called the true exponent.

The fraction is a 23-bit unsigned fraction field containing the
23 LSBs of the floating-point number's 24-bit mantissa. The
weight of the fraction's MSB is 2^-1; the weight of the LSB is 2^-23.
Figure 4. Functional Block Diagram for the 16-Bit, Two-Input Bus Mode






A floating-point number is evaluated or interpreted per the
following conventions:

let s = sign bit
    e = biased exponent
    f = fraction

if e = 0 and f = 0 ...    value = (-1)^s × (0)    (+0, -0)
if e = 0 and f ≠ 0 ...    value = denormalized number
if 0 < e < 255 ...        value = (-1)^s × (2^(e-127)) × (1.f)    (normalized number)
if e = 255 and f = 0 ...  value = (-1)^s × (∞)    (+∞, -∞)
if e = 255 and f ≠ 0 ...  value = not-a-number (NAN)



Zero: The value zero can have either a positive or negative
sign. Rules for determining the sign of a zero produced by an
operation are given in the Sign Bit section.

Denormalized Number: A denormalized number represents a
quantity with magnitude less than 2^-126 but greater than zero.

Normalized Number: A normalized number represents a
quantity with magnitude greater than or equal to 2^-126 but
less than 2^128.
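
The interpretation rules above can be sketched in a few lines of Python. This is only an illustrative model of the format, not a description of the Am29325's internal logic; the function name decode_ieee_single and the returned labels are assumptions made for the example.

```python
from fractions import Fraction

def decode_ieee_single(word):
    """Classify a 32-bit IEEE-mode word; return (kind, value or None)."""
    s = (word >> 31) & 0x1      # sign bit
    e = (word >> 23) & 0xFF     # 8-bit biased exponent
    f = word & 0x7FFFFF         # 23-bit fraction
    if e == 0:
        # e = 0: zero when f = 0, otherwise a denormalized number
        return ("zero", Fraction(0)) if f == 0 else ("denormalized", None)
    if e == 255:
        # e = 255: infinity when f = 0, otherwise a NAN
        return ("infinity", None) if f == 0 else ("nan", None)
    mantissa = 1 + Fraction(f, 1 << 23)          # 1.f, leading 1 implied
    value = (-1) ** s * mantissa * Fraction(2) ** (e - 127)
    return ("normalized", value)
```

Exact rational arithmetic is used so that the smallest normalized magnitude, 2^-126, is returned without any rounding of its own.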

Example 1:

The number +3.5 can be represented in floating-point
format as follows:

+3.5 = 11.1₂ × 2^0
     = 1.11₂ × 2^1

sign = 0

biased exponent = 1₁₀ + 127₁₀ = 128₁₀
                = 10000000₂

fraction = 11000000000000000000000₂

(the leading 1 is implied in the format)

Concatenating these fields produces the floating-point word
40600000₁₆.
Bit 31:        sign (S)
Bits 30 to 23: biased exponent (E), bit weights 2^7 down to 2^0
Bits 22 to 0:  fraction (F), bit weights 2^-1 down to 2^-23

VALUE = (-1)^S × (2^(E-127)) × (1.F)

Figure 6. IEEE Mode Single-Precision Floating-Point Format
Example 2:

The number -11.375 can be represented in floating-point
format as follows:

-11.375 = -1011.011₂ × 2^0
        = -1.011011₂ × 2^3

sign = 1

biased exponent = 3₁₀ + 127₁₀ = 130₁₀
                = 10000010₂

fraction = 01101100000000000000000₂

(the leading 1 is implied in the format)

Concatenating these fields produces the floating-point word
C1360000₁₆.
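
The field concatenation used in the two examples can be checked with a short Python sketch. The helper name pack_ieee_single and its argument convention (fraction bits supplied as a string, left-justified) are assumptions made for this illustration.

```python
def pack_ieee_single(sign, true_exponent, fraction_bits):
    """Concatenate sign, biased exponent, and fraction into a 32-bit word.

    fraction_bits holds the mantissa bits to the right of the binary
    point (the implied leading 1 is not included); it is padded with
    zeros on the right to 23 bits.
    """
    biased = true_exponent + 127                 # bias value is 127
    assert sign in (0, 1)
    assert 0 < biased < 255                      # normalized numbers only
    assert len(fraction_bits) <= 23 and set(fraction_bits) <= {"0", "1"}
    fraction = int(fraction_bits.ljust(23, "0"), 2)
    return (sign << 31) | (biased << 23) | fraction

# +3.5 = 1.11 (binary) x 2^1, -11.375 = -1.011011 (binary) x 2^3
word_a = pack_ieee_single(0, 1, "11")
word_b = pack_ieee_single(1, 3, "011011")
```

Formatting word_a and word_b with format(word, "08X") reproduces the two words derived in Examples 1 and 2.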



Infinity: Infinity can have either a positive or negative sign.
The way in which infinities are interpreted is determined by the
state of the projective/affine mode select, PROJ/AFF.

Not-a-Number: A not-a-number, or NAN, does not represent
a numeric value, but is interpreted as a signal or symbol. NANs
are used to indicate invalid operations, and as a means of
passing process status information through a series of calculations.
NANs arise in two ways: 1) they can be generated by the
Am29325 to indicate that an invalid operation has taken place
(e.g., ∞ × 0), or 2) be provided by the user as an input
operand. There are two types of NANs, signalling and quiet
(see Figure 7 for formats).

IEEE Mode Integer Format 

Integer numbers are represented as 32-bit, two's-complement
words (Figure 8 depicts the integer format). The integer word
can represent a range of integer values from -2^31 to 2^31 - 1.

Figure 7. Signalling and Quiet NAN Formats

Figure 8. 32-Bit Integer Format



Operations 



All eight floating-point ALU operations discussed in the
Functional Description section can be performed in IEEE
mode. Various exceptional aspects of the R PLUS S, R MINUS
S, R TIMES S, 2 MINUS S, INT-TO-FP, and FP-TO-INT
operations for this mode are described below. The IEEE-TO-DEC
and DEC-TO-IEEE operations are discussed separately
in the IEEE-TO-DEC and DEC-TO-IEEE Operations section.



Operations with NANs: NANs arise in two ways: 1) they can
be generated by the Am29325 to indicate that an invalid
operation has taken place (e.g., ∞ × 0), or 2) be provided by
the user as an input operand. There are two types of NANs,
signalling and quiet (see Figure 7 for formats).

Signalling NANs set the invalid operation flag when they
appear as an input operand to an operation. They are useful
for indicating uninitialized variables, or for implementing
user-designed extensions to the operations provided. The ALU
never produces a signalling NAN as the final result of an
operation.

Quiet NANs are generated for invalid operations. When they
appear as an input operand, they are passed through most
operations without setting the invalid flag, the
floating-point-to-integer conversion operation being the exception.

The sign of any input operand NAN is ignored. All quiet NANs
produced as the final result of an operation have a sign of 0.

When a NAN appears as an input operand, the final result of 
the operation is a quiet NAN that is created by taking the input 
NAN and forcing bit 22 LOW and bit 21 HIGH. If an operation 
has two NANs as input operands, the resulting quiet NAN is 
created using the NAN on the R port. 

When a quiet NAN is produced as the final result of an invalid
operation whose input operand or operands are not NANs, the
resulting NAN will always have the value 7FA00000₁₆.

The NAN flag will be HIGH whenever an operation produces a 
NAN as a final result. 

Example 1:

Suppose the floating-point addition operation is performed
with the following input operands:

R port: 3F800000₁₆ (1.0 × 2^0)
S port: 7FC12345₁₆ (signalling NAN)

Result: The signalling NAN on the S port is converted to a
quiet NAN by forcing bit 22 LOW and bit 21 HIGH.
The operation's final result will be 7FA12345₁₆.
Since one of the two input operands is a signalling
NAN, the invalid flag will be HIGH; the NAN flag will
also be HIGH.

Example 2:

Suppose the floating-point multiplication operation is performed
with the following input operands:

R port: FFF11111₁₆ (signalling NAN)
S port: 7FC22222₁₆ (quiet NAN)

Result: Since both input operands are NANs, the NAN on
the R port is chosen for output. In addition to forcing
bit 22 LOW, the sign bit (bit 31) is set LOW (bit 21 is
already HIGH, and need not be changed). The
operation's final result will be 7FB11111₁₆. Since
one of the two input operands is a signalling NAN,
the invalid flag is HIGH; the NAN flag will also be
HIGH.

Example 3:

Suppose the floating-point subtraction operation is performed
with the following input operands:

R port: FF800001₁₆ (quiet NAN)
S port: 7F800000₁₆ (+∞)

Result: To create the final result, the quiet NAN's sign bit (bit
31) is forced LOW and bit 21 is forced HIGH (bit 22
is already LOW, and need not be changed). The final
result will be 7FA00001₁₆. The NAN flag will be
HIGH.
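
The bit manipulation described in these three examples can be sketched as follows. The function names are assumptions made for the illustration, and the sketch covers only the result word; it does not model the invalid flag, since the signalling and quiet formats are defined in Figure 7.

```python
def is_nan(word):
    """True when the exponent field is 255 and the fraction is non-zero."""
    return (word & 0x7F800000) == 0x7F800000 and (word & 0x007FFFFF) != 0

def quiet_from_input(nan_word):
    """Build the output quiet NAN from an input NAN."""
    result = nan_word & 0x7FFFFFFF   # sign of the input NAN is ignored
    result &= ~(1 << 22)             # force bit 22 LOW
    result |= 1 << 21                # force bit 21 HIGH
    return result

def nan_result(r_word, s_word):
    """Result word when at least one operand is a NAN; R port has priority.

    Returns None when neither operand is a NAN.
    """
    if is_nan(r_word):
        return quiet_from_input(r_word)
    if is_nan(s_word):
        return quiet_from_input(s_word)
    return None
```

Calling nan_result with the R and S operands of Examples 1 to 3 yields the final result words given above.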

Operations with Denormalized Numbers: The proposed
IEEE standard incorporates denormalized numbers to allow a
means of gradual underflow for operations that produce non-zero
results too small to be expressed as a normalized
floating-point number. The Am29325 does not support gradual
underflow. If a floating-point operation produces a non-zero
rounded result that is not large enough to be expressed as a
normalized floating-point number, the final result will be a zero
of the same sign; the inexact, underflow, and zero flags will be
HIGH. If an input operand is a denormalized number, the
floating-point ALU will assume that operand to be a zero of the
same sign.

Operations Producing Overflows: If an operation has a finite
input operand or operands, and if the operation produces a
rounded result that is too large to fit in the destination format,
the operation is said to have overflowed.

A floating-point overflow occurs if an R PLUS S, R MINUS S, R
TIMES S, or 2 MINUS S operation with finite input operand(s)
produces a result which, after rounding, has a magnitude
greater than or equal to 2^128. Positive or negative infinity will
appear as the final result if the rounded result is positive or
negative, respectively, and the overflow and inexact flags will
be HIGH.

Integer overflow occurs when the floating-point-to-integer
conversion operation attempts to convert a number which,
after rounding, is greater than 2^31 - 1 or less than -2^31. The
final result will be quiet NAN 7FA00000₁₆, and the invalid
operation and NAN flags will be HIGH. Note that the overflow
and inexact flags remain LOW for integer overflow.
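
As a rough behavioral sketch of the two overflow cases, assuming Python's Fraction type stands in for the infinitely precise value and that only the round-to-nearest mode is modelled for the integer conversion (the function names and the flag dictionary are hypothetical):

```python
from fractions import Fraction

POS_INF, NEG_INF, QUIET_NAN = 0x7F800000, 0xFF800000, 0x7FA00000

def float_overflow(rounded):
    """Return (word, flags) if a rounded floating-point result overflows,
    otherwise None. Applies to finite-operand arithmetic results only."""
    if abs(rounded) >= Fraction(2) ** 128:
        word = POS_INF if rounded > 0 else NEG_INF
        return word, {"overflow": True, "inexact": True}
    return None

def fp_to_int(x):
    """Convert exact value x to a 32-bit two's-complement word.

    Uses round-to-nearest (ties to even) via the built-in round();
    the other rounding modes are not modelled here.
    """
    rounded = round(x)
    if rounded > 2 ** 31 - 1 or rounded < -(2 ** 31):
        # Integer overflow: quiet NAN out, overflow/inexact stay LOW
        return QUIET_NAN, {"invalid": True, "nan": True,
                           "overflow": False, "inexact": False}
    return rounded & 0xFFFFFFFF, {"invalid": False, "nan": False,
                                  "overflow": False,
                                  "inexact": rounded != x}
```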

Operations Producing Underflows: If an operation produces
a floating-point rounded result having a magnitude too small to
be expressed as a normalized floating-point number, but
greater than zero, that operation is said to have underflowed.
Underflow occurs when an R PLUS S, R MINUS S, or R
TIMES S operation produces a result which, after rounding,
has a magnitude in the range:

0 < magnitude < 2^-126

In such cases, the final result will be +0 (00000000₁₆) if the
rounded result is non-negative, and -0 (80000000₁₆) if the
rounded result is negative. The underflow, inexact, and zero
flags will be HIGH.

Underflow does not occur if the destination format is integer. If
the infinitely precise result of a floating-point-to-integer
conversion has a magnitude greater than 0 and less than 1, but
the rounded result is 0, the underflow flag remains LOW.
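
The flush-to-zero behavior for underflowed results, together with the treatment of denormalized input operands described earlier, can be sketched as below. The names flush_underflow and squash_denormal_input are assumptions made for the illustration, and the rounded result is represented as an exact Fraction.

```python
from fractions import Fraction

MIN_NORMAL = Fraction(1, 2 ** 126)   # smallest normalized magnitude

def flush_underflow(rounded):
    """Return (word, flags) if a rounded floating-point result underflows,
    otherwise None."""
    if 0 < abs(rounded) < MIN_NORMAL:
        word = 0x00000000 if rounded > 0 else 0x80000000   # zero, same sign
        return word, {"underflow": True, "inexact": True, "zero": True}
    return None

def squash_denormal_input(word):
    """Treat a denormalized input operand as a zero of the same sign."""
    if (word & 0x7F800000) == 0:      # biased exponent field is 0
        return word & 0x80000000      # keep only the sign bit
    return word
```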



Operations with Infinities: In most cases, positive and
negative infinity are valid inputs for the R PLUS S, R MINUS S,
R TIMES S, and 2 MINUS S operations. Those cases for which
infinities are not valid inputs for these operations are listed in
Table 4.

Infinities in IEEE mode can be handled either as projective or
affine. The projective mode is selected when PROJ/AFF is
HIGH; the affine mode is selected when PROJ/AFF is LOW.
The only differences between the modes that are relevant to
Am29325 operation occur during the addition and subtraction
of infinities:

Operation      Affine Mode   Projective Mode

(+∞) + (+∞)    Output +∞     Output 7FA00000₁₆ (quiet NAN),
                             set invalid and NAN flags

(-∞) + (-∞)    Output -∞     Output 7FA00000₁₆ (quiet NAN),
                             set invalid and NAN flags

(+∞) - (-∞)    Output +∞     Output 7FA00000₁₆ (quiet NAN),
                             set invalid and NAN flags

(-∞) - (+∞)    Output -∞     Output 7FA00000₁₆ (quiet NAN),
                             set invalid and NAN flags

If an R PLUS S, R MINUS S, or 2 MINUS S operation has
infinity as an input operand or operands, the final result, if
valid, is presumed to be exact. For example, adding +∞ and
2.0 will produce a final result of +∞; since the result is
considered exact, the inexact flag remains LOW.
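
A minimal sketch of how the projective/affine select changes the result when both operands are infinities is given below. The function names and the boolean argument projective (True when PROJ/AFF is HIGH) are assumptions for the illustration; the case of opposite-signed infinities being added is taken to be invalid in both modes.

```python
POS_INF, NEG_INF, QUIET_NAN = 0x7F800000, 0xFF800000, 0x7FA00000

def add_infinities(r_word, s_word, projective):
    """R PLUS S where both operands are infinities.

    Returns (result word, invalid flag).
    """
    if r_word != s_word:
        return QUIET_NAN, True        # e.g. (+inf) + (-inf)
    if projective:
        return QUIET_NAN, True        # like-signed sum invalid in projective mode
    return r_word, False              # affine mode: infinity of the same sign

def sub_infinities(r_word, s_word, projective):
    """R MINUS S, expressed as R PLUS (S with its sign bit inverted)."""
    return add_infinities(r_word, s_word ^ 0x80000000, projective)
```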

Invalid Operations: If an input operand is invalid for the
operation to be performed, that operation is considered
invalid. When an invalid operation is performed, the
floating-point ALU produces a quiet NAN as the final result, and the
invalid operation flag goes HIGH. Table 4 lists the cases for
which the invalid flag is HIGH in IEEE mode, and the final
results produced for these operations.

TABLE 4. IEEE MODE INVALID OPERATIONS

Operation     Input Operand                  Final Result

R PLUS S      (+∞) + (-∞)                    7FA00000₁₆
              or (-∞) + (+∞)                 (quiet NAN)

R PLUS S      (+∞) + (+∞)                    7FA00000₁₆
              or (-∞) + (-∞) (Note 1)        (quiet NAN)

R MINUS S     (+∞) - (+∞)                    7FA00000₁₆
              or (-∞) - (-∞)                 (quiet NAN)

R MINUS S     (+∞) - (-∞)                    7FA00000₁₆
              or (-∞) - (+∞) (Note 1)        (quiet NAN)

R TIMES S     (+0) * (+∞)                    7FA00000₁₆
              or (+0) * (-∞)                 (quiet NAN)
              or (-0) * (+∞)
              or (-0) * (-∞)

R PLUS S      R or S is a signalling         (Note 2)
R MINUS S     NAN
R TIMES S

2 MINUS S     S is a signalling NAN          (Note 2)

FP-TO-INT     R is a signalling or           (Note 2)
              quiet NAN

FP-TO-INT     R > 2^31 - 1                   7FA00000₁₆
              or R < -(2^31)                 (quiet NAN)

Notes: 1. These cases are invalid in projective mode only.

2. Results for these operations are described in the Operations
with NANs section.

The Sign Bit

For most floating-point operations, the sign bit of the final
result is unambiguous; i.e., there is only one sign bit value that
yields a numerically correct result. Operations that produce an
infinitely precise result of zero, however, present a problem, as
the IEEE floating-point format allows for representation of both
+0 and -0. The following rules can be used to determine the
signs of zero produced in such cases.

R PLUS S: The operations +x + (-x) and -x + (+x) produce a
final result of zero; the sign of the zero is dependent on the
rounding mode:

Rounding Mode        Sign of Final Result

Round to nearest     0
Round toward -∞      1
Round toward +∞      0
Round toward 0       0

Operations +0 + (-0) and -0 + (+0) produce a result of 0,
with the sign of the result determined by the table above.

The operation +0 + (+0) produces a final result of +0; the
operation -0 + (-0) produces a final result of -0.

R MINUS S: The operations +x - (+x) and -x - (-x) produce a
final result of zero; the sign of the zero is dependent on the
rounding mode:

Rounding Mode        Sign of Result

Round to nearest     0
Round toward -∞      1
Round toward +∞      0
Round toward 0       0

Operations +0 - (+0) and -0 - (-0) produce a result of 0, with
the sign of the result determined by the table above.

The operation +0 - (-0) produces a final result of +0; the
operation -0 - (+0) produces a final result of -0.

R TIMES S: The sign of any multiplication result other than a
NAN is the exclusive OR of the signs of the input operands.
Therefore, if x is non-negative,

+0 times +x produces a final result of +0,

+0 times -x produces a final result of -0,

-0 times +x produces a final result of -0,

-0 times -x produces a final result of +0.

2 MINUS S: If S equals 2, the final result is -0 for the round
toward -∞ mode, and +0 for all other rounding modes.
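
The sign rules for zero results can be condensed into a small Python sketch. The mode labels and function names below are assumptions made for the illustration; signs are passed as sign bits (0 for positive, 1 for negative), and the functions apply only when the infinitely precise result is zero (or, for the product, when the result is not a NAN).

```python
ROUND_NEAREST, ROUND_MINUS_INF, ROUND_PLUS_INF, ROUND_ZERO = (
    "nearest", "-inf", "+inf", "zero")   # labels only, not select-line values

def sign_of_zero_sum(sign_r, sign_s, mode):
    """Sign bit of a zero produced by R PLUS S."""
    if sign_r == sign_s:
        return sign_r                 # (+0) + (+0) -> +0, (-0) + (-0) -> -0
    # operands of opposite sign cancel: sign depends on the rounding mode
    return 1 if mode == ROUND_MINUS_INF else 0

def sign_of_zero_difference(sign_r, sign_s, mode):
    """Sign bit of a zero produced by R MINUS S (S negated, then added)."""
    return sign_of_zero_sum(sign_r, sign_s ^ 1, mode)

def sign_of_product(sign_r, sign_s):
    """Sign bit of any non-NAN R TIMES S result: exclusive OR of the signs."""
    return sign_r ^ sign_s
```

Under this model 2 MINUS S with S equal to 2 is simply sign_of_zero_difference(0, 0, mode), which gives 1 only for the round toward -∞ label.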

Rounding 

Rounding is performed whenever an operation produces an
infinitely precise result that cannot be represented exactly in 
the destination format. For example, suppose a floating-point 
operation produces the infinitely precise result: 

1.1010101 01 01010101010101 \01 X 2^. 

In this example, the fraction portion of the mantissa has 25 
bits; the IEEE floating-point format can accommodate only 23.
The backslash (\) in the mantissa represents the boundary 
between the first 23 bits of the fraction and any remaining bits. 
Rounding is the process by which this result is approximated 
by a representation that fits the destination format. 

There are four rounding modes in IEEE mode: 1) round to
nearest, 2) round toward +∞, 3) round toward -∞, and 4)
round toward 0. The rounding mode is chosen using the
rounding mode select lines, RND0 and RND1. Table 5 lists the
select states needed to obtain the desired rounding mode.

TABLE 5. ROUNDING MODE SELECT

[Table 5: columns RND1, RND0, Rounding Mode. Rows, in order: round to nearest, round toward -∞, round toward +∞, round toward 0. The select-line values are not legible in this copy.]

Round to Nearest: In this rounding mode the infinitely precise
result of an operation is rounded to the closest representation 
that fits in the destination format. If the infinitely precise result 
is exactly halfway between two representations, it is rounded 
to the representation having an LSB of zero. Rounding is 
performed both for floating-point and integer destination 
formats. 

Figure 9 illustrates four examples of the round-to-nearest 
process for operations having a floating-point destination 
format. The infinitely precise result of an operation is repre- 
sented by an "X" on the number line; the black dots on the 
number line indicate those values that can be represented 
exactly in the floating-point format. 

Example 1:

In Figure 9(a), the infinitely precise result of an operation is:

2^20 + 2^-4 + 2^-5 = 1.00000000000000000000000\11 × 2^20

The result is rounded to the closest representable floating-point
value,

2^20 + 2^-3 = 1.00000000000000000000001 × 2^20

Example 2:

In Figure 9(b), the infinitely precise result of an operation is:

2^20 - 2^-4 + 2^-8 = 1.11111111111111111111111\0001 × 2^19

This result is rounded to the closest representable floating-point
value,

2^20 - 2^-4 = 1.11111111111111111111111 × 2^19

Example 3:

In Figure 9(c), the infinitely precise result of an operation is:

-(2^20 + 2^-3 + 2^-4) = -1.00000000000000000000001\1 × 2^20

This result is exactly halfway between two representable
floating-point values. Accordingly, it is rounded to the
closest representation with an LSB of zero, or

-(2^20 + 2×2^-3) = -1.00000000000000000000010 × 2^20

Example 4:

In Figure 9(d), the infinitely precise result of an operation is:

2^20 + 3×2^-3 = 1.00000000000000000000011 × 2^20

This result can be represented exactly in the floating-point
format, and is left unaltered by the rounding process.
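
The four round-to-nearest examples can be reproduced with a short Python sketch that rounds an exact rational value to a 24-bit mantissa. This is an arithmetic model only, not the device's rounding logic; the function name is an assumption, and exponent range limits (overflow, underflow) are deliberately ignored.

```python
from fractions import Fraction

def round_to_nearest_24(x):
    """Round exact value x to 24 significant bits, ties to an even LSB."""
    if x == 0:
        return x
    sign = -1 if x < 0 else 1
    mag = abs(x)
    # Find e such that 2^e <= mag < 2^(e+1)
    e = mag.numerator.bit_length() - mag.denominator.bit_length()
    if Fraction(2) ** e > mag:
        e -= 1
    lsb = Fraction(2) ** (e - 23)     # weight of the fraction's LSB
    steps = round(mag / lsb)          # built-in round(): halfway -> even
    return sign * steps * lsb

two20 = Fraction(2) ** 20
ex1 = round_to_nearest_24(two20 + Fraction(1, 16) + Fraction(1, 32))
ex2 = round_to_nearest_24(two20 - Fraction(1, 16) + Fraction(1, 256))
ex3 = round_to_nearest_24(-(two20 + Fraction(1, 8) + Fraction(1, 16)))
ex4 = round_to_nearest_24(two20 + Fraction(3, 8))
```

Here ex1 to ex4 correspond to Figure 9(a) to 9(d); Python's built-in round() on a Fraction already resolves halfway cases toward the even neighbor, which matches the LSB-of-zero rule.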
Figure 9. Floating-Point Rounding Examples for Round-to-Nearest Mode

Figure 10 illustrates four examples of the round-to-nearest
process for operations having an integer destination format. 
The infinitely precise result of an operation is represented by 
an "X" on the number line; the black dots on the number line 
indicate those values that can be represented exactly in the 
integer format. 

Example 1:

In Figure 10(a), the infinitely precise result of an operation is:

2^10 - 2^-2 = 00...001111111111.11

The result is rounded to the closest representable integer
value,

2^10 = 00...010000000000

Example 2:

In Figure 10(b), the infinitely precise result of an operation is:

2^10 + 2^0 + 2^-3 = 00...010000000001.001

This result is rounded to the closest representable integer
value,

2^10 + 2^0 = 00...010000000001

Example 3:

In Figure 10(c), the infinitely precise result of an operation is:

-(2^10 + 2^0 + 2^-1) = 11...101111111110.1

This result is exactly halfway between two representable
integer values. Accordingly, it is rounded to the closest
representation with an LSB of zero, or

-(2^10 + 2×2^0) = 11...101111111110

Example 4:

In Figure 10(d), the infinitely precise result of an operation is:

2^10 + 3×2^0 = 00...010000000011

This result can be represented exactly in the integer format,
and is left unaltered by the rounding process.
Figure 10. Integer Rounding Examples for Round-to-Nearest Mode

Round Toward -∞: In this rounding mode the result of an
operation is rounded to the closest representation that is less 
than or equal to the infinitely precise result, and which fits the 
destination format. Rounding is performed both for floating- 
point and integer destination formats. 

Figure 11 illustrates four examples of the round toward -∞
process for operations having a floating-point destination
format. The infinitely precise result of an operation is represented
by an "X" on the number line; the black dots on the
number line indicate those values that can be represented
exactly in the floating-point format.

Example 1:

In Figure 11(a), the infinitely precise result of an operation is:

2^20 + 2^-4 + 2^-5 = 1.00000000000000000000000\11 × 2^20

This result cannot be represented exactly in floating-point
format, and is rounded to the next-smaller floating-point
representation:

2^20 = 1.00000000000000000000000 × 2^20

Example 2:

In Figure 11(b), the infinitely precise result of an operation is:

2^20 - 2^-4 + 2^-8 = 1.11111111111111111111111\0001 × 2^19

This result cannot be represented exactly in floating-point
format, and is rounded to the next-smaller floating-point
representation:

2^20 - 2^-4 = 1.11111111111111111111111 × 2^19

Example 3:

In Figure 11(c), the infinitely precise result of an operation is:

-(2^20 + 2^-3 + 2^-4) = -1.00000000000000000000001\1 × 2^20

This result cannot be represented exactly in floating-point
format, and is rounded to the next-smaller floating-point
representation:

-(2^20 + 2×2^-3) = -1.00000000000000000000010 × 2^20

Example 4:

In Figure 11(d), the infinitely precise result of an operation is:

2^20 + 3×2^-3 = 1.00000000000000000000011 × 2^20

This result can be represented exactly in the floating-point
format, and is left unaltered by the rounding process.
Figure 11. Floating-Point Rounding Examples for Round Toward -∞ Mode

Figure 12 illustrates four examples of the round toward -∞
process for operations having an integer destination format.
The infinitely precise result of an operation is represented by
an "X" on the number line; the black dots on the number line
indicate those values that can be exactly represented in the
integer format.

Example 1:

In Figure 12(a), the infinitely precise result of an operation is:

2^10 - 2^-2 = 00...001111111111.11

The result is rounded to the next-smaller representable
integer value,

2^10 - 2^0 = 00...001111111111

Example 2:

In Figure 12(b), the infinitely precise result of an operation is:

2^10 + 2^0 + 2^-3 = 00...010000000001.001

This result is rounded to the next-smaller representable
integer value,

2^10 + 2^0 = 00...010000000001

Example 3:

In Figure 12(c), the infinitely precise result of an operation is:

-(2^10 + 2^0 + 2^-1) = 11...101111111110.1

This result is rounded to the next-smaller representable
integer value:

-(2^10 + 2×2^0) = 11...101111111110

Example 4:

In Figure 12(d), the infinitely precise result of an operation is:

2^10 + 3×2^0 = 00...010000000011

This result can be represented exactly in the integer format,
and is unaltered by the rounding process.
Figure 12. Integer Rounding Examples for Round Toward -∞ Mode

Round Toward +∞: In this rounding mode the result of an
operation is rounded to the closest representation that is 
greater than or equal to the infinitely precise result, and which 
fits the destination format. Rounding is performed both for 
floating-point and integer destination formats. 

Figure 13 illustrates four examples of the round toward +∞
process for operations having a floating-point destination 
format. The infinitely precise result of an operation is repre- 
sented by an "X" on the number line; the black dots on the 
number line indicate those values that can be represented 
exactly in the floating-point format. 

Example 1:

In Figure 13(a), the infinitely precise result of an operation is:

2^20 + 2^-4 + 2^-5 = 1.00000000000000000000000\11 × 2^20

This result cannot be represented exactly in floating-point
format, and is rounded to the next-larger floating-point
representation:

2^20 + 2^-3 = 1.00000000000000000000001 × 2^20

Example 2:

In Figure 13(b), the infinitely precise result of an operation is:

2^20 - 2^-4 + 2^-8 = 1.11111111111111111111111\0001 × 2^19

This result cannot be represented exactly in floating-point
format, and is rounded to the next-larger floating-point
representation:

2^20 = 1.00000000000000000000000 × 2^20

Example 3:

In Figure 13(c), the infinitely precise result of an operation is:

-(2^20 + 2^-3 + 2^-4) = -1.00000000000000000000001\1 × 2^20

This result cannot be represented exactly in floating-point
format, and is rounded to the next-larger floating-point
representation:

-(2^20 + 2^-3) = -1.00000000000000000000001 × 2^20

Example 4:

In Figure 13(d), the infinitely precise result of an operation is:

2^20 + 3×2^-3 = 1.00000000000000000000011 × 2^20

This result can be represented exactly in the floating-point
format; no rounding takes place.
Figure 13. Floating-Point Rounding Examples for Round Toward +∞ Mode

Figure 14 illustrates four examples of the round toward +∞
process for operations having an integer destination format.
The infinitely precise result of an operation is represented by
an "X" on the number line; the black dots on the number line
indicate those values that can be exactly represented in the
integer format.

Example 1:

In Figure 14(a), the infinitely precise result of an operation is:

2^10 - 2^-2 = 00...001111111111.11

The result is rounded to the next-larger representable
integer value,

2^10 = 00...010000000000

Example 2:

In Figure 14(b), the infinitely precise result of an operation is:

2^10 + 2^0 + 2^-3 = 00...010000000001.001

This result is rounded to the next-larger representable
integer value,

2^10 + 2×2^0 = 00...010000000010

Example 3:

In Figure 14(c), the infinitely precise result of an operation is:

-(2^10 + 2^0 + 2^-1) = 11...101111111110.1

This result is rounded to the next-larger representable
integer value:

-(2^10 + 2^0) = 11...101111111111

Example 4:

In Figure 14(d), the infinitely precise result of an operation is:

2^10 + 3×2^0 = 00...010000000011

This result can be represented exactly in the integer
format; no rounding takes place.
Figure 14. Integer Rounding Examples for Round Toward +∞ Mode

Round Toward 0: In this rounding mode the result of an
operation is rounded to the closest representation whose 
magnitude is less than or equal to the infinitely precise result, 
and which fits the destination format. Rounding is performed 
both for floating-point and integer destination formats. 

Figure 15 illustrates four examples of the round toward 0
process for operations having a floating-point destination
format. The infinitely precise result of an operation is represented
by an "X" on the number line; the black dots on the
number line indicate those values that can be represented
exactly in the floating-point format.

Example 1:

In Figure 15(a), the infinitely precise result of an operation is:

2^20 + 2^-4 + 2^-5 = 1.00000000000000000000000\11 × 2^20

This result cannot be represented exactly in floating-point
format, and is rounded to:

2^20 = 1.00000000000000000000000 × 2^20

Example 2:

In Figure 15(b), the infinitely precise result of an operation is:

2^20 - 2^-4 + 2^-8 = 1.11111111111111111111111\0001 × 2^19

This result cannot be represented exactly in floating-point
format, and is rounded to:

2^20 - 2^-4 = 1.11111111111111111111111 × 2^19

Example 3:

In Figure 15(c), the infinitely precise result of an operation is:

-(2^20 + 2^-3 + 2^-4) = -1.00000000000000000000001\1 × 2^20

This result cannot be represented exactly in floating-point
format, and is rounded to:

-(2^20 + 2^-3) = -1.00000000000000000000001 × 2^20

Example 4:

In Figure 15(d), the infinitely precise result of an operation is:

2^20 + 3×2^-3 = 1.00000000000000000000011 × 2^20

This result can be represented exactly in the floating-point
format, and is unaffected by the rounding process.
Figure 15. Floating-Point Rounding Examples for Round Toward 0 Mode

Figure 16 illustrates four examples of the round toward 0
process for operations having an integer destination format.
The infinitely precise result of an operation is represented by
an "X" on the number line; the black dots on the number line
indicate those values that can be exactly represented in the
integer format.

Example 1:

In Figure 16(a), the infinitely precise result of an operation is:

2^10 - 2^-2 = 00...001111111111.11

The result is rounded to:

2^10 - 2^0 = 00...001111111111

Example 2:

In Figure 16(b), the infinitely precise result of an operation is:

2^10 + 2^0 + 2^-3 = 00...010000000001.001

The result is rounded to:

2^10 + 2^0 = 00...010000000001

Example 3:

In Figure 16(c), the infinitely precise result of an operation is:

-(2^10 + 2^0 + 2^-1) = 11...101111111110.1

The result is rounded to:

-(2^10 + 2^0) = 11...101111111111

Example 4:

In Figure 16(d), the infinitely precise result of an operation is:

2^10 + 3×2^0 = 00...010000000011

This result can be represented exactly in the integer format,
and is unaffected by the rounding process.
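
For an integer destination format, the four rounding modes map directly onto standard library operations in Python, which makes the integer examples in Figures 10, 12, 14, and 16 easy to cross-check. The mode labels and function names are assumptions made for this sketch; integer overflow handling is not included.

```python
import math
from fractions import Fraction

def round_to_integer(x, mode):
    """Round exact value x to an integer under the named rounding mode."""
    if mode == "nearest":
        return round(x)          # halfway cases go to the even integer
    if mode == "-inf":
        return math.floor(x)     # closest integer <= x
    if mode == "+inf":
        return math.ceil(x)      # closest integer >= x
    if mode == "zero":
        return math.trunc(x)     # closest integer with magnitude <= |x|
    raise ValueError("unknown rounding mode: " + mode)

def twos_complement_32(n):
    """32-bit two's-complement bit string for integer n."""
    return format(n & 0xFFFFFFFF, "032b")

# The four infinitely precise results used in the integer examples
samples = [Fraction(2 ** 10) - Fraction(1, 4),
           Fraction(2 ** 10 + 1) + Fraction(1, 8),
           -(Fraction(2 ** 10 + 1) + Fraction(1, 2)),
           Fraction(2 ** 10 + 3)]
results = {mode: [round_to_integer(x, mode) for x in samples]
           for mode in ("nearest", "-inf", "+inf", "zero")}
```

The low-order bits printed by twos_complement_32 for -1026 and -1025 match the bit patterns shown for the negative examples.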
Flag Operation

The Am29325 generates six status flags to monitor floating-point
processor operation. The following is a summary of flag
conventions in IEEE mode:

Invalid Operation Flag: The invalid operation flag is HIGH
when an input operand is invalid for the operation to be
performed. Table 4 lists the cases for which the invalid
operation flag is HIGH in IEEE mode, and the corresponding
final result. In cases where the invalid operation flag is HIGH,
the overflow, underflow, zero, and inexact flags are LOW; the
NAN flag will be HIGH.

Overflow Flag: The overflow flag is HIGH if an R PLUS S, R
MINUS S, R TIMES S, or 2 MINUS S operation with finite input
operand(s) produces a result which, after rounding, has a
magnitude greater than or equal to 2^128. The final result will
be +∞ or -∞.

Underflow Flag: The underflow flag is HIGH if an R PLUS S,
R MINUS S, or R TIMES S operation produces a result which,
after rounding, has a magnitude in the range:

0 < magnitude < 2^-126

The final result will be +0 (00000000₁₆) if the rounded result is
non-negative, and -0 (80000000₁₆) if the rounded result is
negative.

Inexact Flag: The inexact flag is HIGH if the final result of an
R PLUS S, R MINUS S, R TIMES S, 2 MINUS S, INT-TO-FP, or
FP-TO-INT operation is not equal to the infinitely precise
result. Note that if the underflow or overflow flag is HIGH, the
inexact flag will also be HIGH.

Zero Flag: The zero flag is HIGH if the final result of an
operation is zero. For operations producing an IEEE floating-point
number, the flag accompanies outputs +0 (00000000₁₆)
and -0 (80000000₁₆). For operations producing an integer,
the flag accompanies the output 0 (00000000₁₆).

NAN Flag: The NAN flag is HIGH if an R PLUS S, R MINUS S, 
R TIMES S, 2 MINUS S, or FP-TO-INT operation produces a 
NAN as a final result. 

Operation in DEC Mode

When input signal IEEE/DEC is LOW, the DEC mode of
operation is selected. In this mode the Am29325 uses the
single-precision floating-point format (floating F) set forth in
Digital Equipment Corporation's VAX Architecture Manual. In
addition, the DEC mode complies with most other aspects of
single-precision floating-point operation outlined in the manual;
differences are discussed in Appendix B.

DEC Floating-Point Format 

The DEC single-precision floating-point word is 32 bits wide,
and is arranged in the format shown in Figure 17. The floating-point
word is divided into three fields: a single-bit sign, an 8-bit
biased exponent, and a 23-bit fraction.

The sign bit indicates the sign of the floating-point number's 
value. Non-negative values have a sign of 0, negative values a 
sign of 1. 

The biased exponent is an 8-bit unsigned integer field representing
a multiplicative factor of some power of two. The bias
value is 128. If, for example, the multiplicative factor for a
floating-point number is to be 2^a, the value of the biased
exponent would be a + 128; "a" is called the true exponent.

The fraction is a 23-bit unsigned fractional field containing the
23 LSBs of the floating-point number's 24-bit mantissa. The
weight of this field's MSB is 2^-2; the weight of the LSB is 2^-24.

A floating-point number is evaluated or interpreted per the
following conventions:

let s = sign bit
    e = biased exponent
    f = fraction

if e = 0 and s = 0 ... value = 0
if e = 0 and s = 1 ... value = DEC-reserved operand
if 0 < e ≤ 255 ... value = (-1)^s × 2^(e-128) × (0.1f)
                   (normalized number)

Zero: The value zero always has a sign of zero. 

DEC-Reserved Operand: A DEC-reserved operand does not
represent a numeric value, but is interpreted as a signal or
symbol. DEC-reserved operands are used to indicate invalid
operations and operations whose results have overflowed the
destination format. They may also be used to pass symbolic
information from one calculation to another.

Normalized Number: A normalized number represents a
quantity with magnitude greater than or equal to 2^-128 but
less than 2^127.

Example 1:

The number +3.5 can be represented in floating-point
format as follows:

+3.5 = 11.1₂ × 2^0
     = .111₂ × 2^2

sign = 0

biased exponent = 2₁₀ + 128₁₀ = 130₁₀
                = 10000010₂

fraction = 11000000000000000000000₂

(the leading 1 is implied in the format)

Concatenating these fields produces the floating-point word
41600000₁₆.

Example 2:

The number -11.375 can be represented in floating-point
format as follows:

-11.375 = -1011.011₂ × 2^0
        = -.1011011₂ × 2^4

sign = 1

biased exponent = 4₁₀ + 128₁₀ = 132₁₀
                = 10000100₂

fraction = 01101100000000000000000₂

(the leading 1 is implied in the format)

Concatenating these fields produces the floating-point word
C2360000₁₆.
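
The interpretation rules and the two examples above can be checked with a short script. This is only an illustrative sketch: the function name decode_dec_word is hypothetical, and returning None for a DEC-reserved operand is a convention chosen here, not part of the device.

```python
def decode_dec_word(word):
    """Interpret a 32-bit word laid out as sign (bit 31),
    biased exponent (bits 30-23), fraction (bits 22-0)."""
    sign = (word >> 31) & 0x1
    biased_exp = (word >> 23) & 0xFF
    fraction = word & 0x7FFFFF
    if biased_exp == 0:
        # e = 0: zero if s = 0, DEC-reserved operand if s = 1
        return 0.0 if sign == 0 else None
    # Mantissa is 0.1f: the implied leading 1 has weight 2^-1,
    # and the 23 fraction bits carry weights 2^-2 .. 2^-24.
    mantissa = 0.5 + fraction / 2**24
    value = mantissa * 2.0 ** (biased_exp - 128)
    return -value if sign else value


print(decode_dec_word(0x41600000))  # Example 1
print(decode_dec_word(0xC2360000))  # Example 2
```

Both example words decode exactly because every quantity involved is a small power-of-two fraction.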

DEC Mode Integer Format 

DEC mode integer format is identical to that of the IEEE mode.
Integer numbers are represented as 32-bit, two's-complement
words (Figure 8 depicts the integer format). The integer word
can represent a range of integer values from -2^31 to 2^31 - 1.



Operations 

All eight floating-point ALU operations discussed in the
General Description section can be performed in DEC mode.
Various exceptional aspects of the R PLUS S, R MINUS S, R
TIMES S, 2 MINUS S, INT-TO-FP, and FP-TO-INT operations
for this mode are described below. The IEEE-TO-DEC and
DEC-TO-IEEE operations are discussed separately in the
IEEE-TO-DEC and DEC-TO-IEEE Operations section.

Figure 17. DEC-Mode Floating-Point Format

Operations with DEC-Reserved Operands: DEC-reserved
operands arise in two ways: 1) they can be generated by the
Am29325 to indicate that an invalid operation or floating-point
overflow has taken place, or 2) be provided by the user as an
input operand.

When a DEC-reserved operand appears as an input operand,
the final result of the operation is the same DEC-reserved
operand. If an operation has two DEC-reserved operands as
inputs, the DEC-reserved operand on the R port becomes the
final result.

The NAN flag will be HIGH whenever an operation produces a
DEC-reserved operand as a final result.

Example 1:

Suppose the floating-point addition operation is performed
with the following input operands:

R port: 40800000₁₆ (0.1₂ × 2^1)

S port: 80012345₁₆ (DEC-reserved operand)

Result: This operation produces the DEC-reserved operand
        on the S port, 80012345₁₆, as the final result. The
        NAN flag will be HIGH.

Example 2:

Suppose the floating-point multiplication operation is performed
with the following input operands:

R port: 80765432₁₆ (DEC-reserved operand)
S port: 80000001₁₆ (DEC-reserved operand)

Result: Since both input operands are DEC-reserved operands,
        the operand on the R port, 80765432₁₆, is the
        final result of the operation. The NAN flag will be
        HIGH.

Operations Producing Overflows: If an operation produces
a rounded result that is too large to fit in the destination
format, that operation is said to have overflowed.

A floating-point overflow occurs if an R PLUS S, R MINUS S, R
TIMES S, or 2 MINUS S operation with finite input operand(s)
produces a result which, after rounding, has a magnitude
greater than or equal to 2^127. The final result in such cases will
be DEC-reserved operand 80000000₁₆; the overflow, inexact,
and NAN flags will be HIGH.

Integer overflow occurs when the "floating-point-to-integer"
conversion operation attempts to convert to integer a floating-point
number which, after rounding, is greater than 2^31 - 1 or
less than -2^31. The final result in such cases will be DEC-reserved
operand 80000000₁₆; the invalid operation flag will
be HIGH. Note that the overflow and inexact flags remain
LOW for integer overflow.

Operations Producing Underflows: If an operation produces
a floating-point result which, after rounding, has a magnitude
too small to be expressed as a normalized floating-point
number, but greater than 0, that operation is said to have
underflowed. Underflow occurs when an R PLUS S, R MINUS
S, or R TIMES S operation produces a result which, after
rounding, has the magnitude:

0 < magnitude < 2^-128.

The final result in such cases will be 0 (00000000₁₆). The
underflow, inexact, and zero flags will be HIGH.

Underflow does not occur if the destination format is integer. If
the infinitely precise result of a floating-point-to-integer conversion
has a magnitude greater than 0 and less than 1, but
the rounded result is 0, the underflow flag remains LOW.

Invalid Operations: If an input operand is invalid for the
operation to be performed, that operation is considered
invalid. There is only one invalid operation in DEC mode:
performing a floating-point-to-integer conversion on a value
too large to be converted to an integer. In this case, the final
result will be DEC-reserved operand 80000000₁₆, and the
invalid operation and NAN flags will be HIGH.

Sign Bit 

For all operations producing a DEC floating-point result, the 
sign bit of the final result is unambiguous; i.e., there is only one 
sign bit value that yields a numerically correct result. 



Rounding 

There are four rounding modes for DEC operation: 1) round to
nearest, 2) round toward +∞, 3) round toward -∞, and 4)
round toward 0. The round toward +∞, round toward -∞, and
round toward 0 modes are performed in a manner identical to
that for IEEE operation; refer to the Rounding section under
Operation in IEEE Mode. The round to nearest mode is
similar to that for IEEE operation, but differs in one respect: for
the case in which the infinitely precise result of an operation is
exactly halfway between two representable values, DEC round
to nearest mode rounds to the value with the larger magnitude,
rather than to the value whose LSB is 0.
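
The difference between the two round to nearest rules only shows up on exact ties. The sketch below illustrates this on an exact significand scaled so that the LSB has weight 1; the function name and the use of Python's Fraction type are illustrative choices, not a description of the device's internal logic.

```python
from fractions import Fraction


def round_to_nearest(exact, dec_mode):
    """Round an exact value to an integer LSB.
    Ties go to the larger magnitude in DEC mode and to the
    value whose LSB is 0 (even) in IEEE mode."""
    exact = Fraction(exact)
    lower = exact.numerator // exact.denominator  # floor
    remainder = exact - lower
    if remainder > Fraction(1, 2):
        return lower + 1
    if remainder < Fraction(1, 2):
        return lower
    # Exactly halfway between lower and lower + 1
    if dec_mode:
        return lower + 1 if exact > 0 else lower
    return lower if lower % 2 == 0 else lower + 1


for value in (Fraction(5, 2), Fraction(7, 2), Fraction(-5, 2)):
    print(value, round_to_nearest(value, True), round_to_nearest(value, False))
```

For the tie 5/2 the DEC rule gives 3 while the IEEE rule gives 2; for 7/2 both rules give 4.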

Flag Operation 

The Am29325 generates six status flags to monitor floating-point
processor operation. The following is a summary of flag
operation in DEC mode:

Invalid Operation Flag: The invalid operation flag is HIGH if
the FP-TO-INT operation is performed on a floating-point
number too large to be converted to an integer. The final result
for such an operation will be the DEC-reserved operand
80000000₁₆.

Overflow Flag: The overflow flag is HIGH if an R PLUS S, R
MINUS S, R TIMES S, or 2 MINUS S operation produces a
result which, after rounding, has a magnitude greater than or
equal to 2^127. The final result will be the DEC-reserved
operand 80000000₁₆.

Underflow Flag: The underflow flag is HIGH if an R PLUS S,
R MINUS S, or R TIMES S operation produces a result which,
after rounding, has a magnitude in the range:

0 < magnitude < 2^-128

The final result will be 0 (00000000₁₆) in such cases.

Inexact Flag: The inexact flag is HIGH if the final result of an
R PLUS S, R MINUS S, R TIMES S, 2 MINUS S, INT-TO-FP, or
FP-TO-INT operation is not equal to the infinitely precise
result. Note that if the underflow or overflow flag is HIGH, the
inexact flag will also be HIGH.

Zero Flag: The zero flag is HIGH if the final result of an
operation is 0. For operations producing an integer or a DEC
floating-point number, the flag accompanies the output 0
(00000000₁₆). (It should be noted that any operation producing
a floating-point 0 in DEC mode will output 00000000₁₆.)

NAN Flag: The NAN flag is HIGH if an R PLUS S, R MINUS S,
R TIMES S, 2 MINUS S, or FP-TO-INT operation produces a
DEC-reserved operand as the final result.
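
As a rough illustration of the floating-point overflow and underflow cases in this summary, the sketch below maps a rounded result to the forced output word and the flags that go HIGH. The function name, the flag names as strings, and the use of None for "no forced result" are assumptions made for the example; the inexact flag for ordinary results is not modeled.

```python
from fractions import Fraction

DEC_RESERVED = 0x80000000  # DEC-reserved operand produced on overflow
DEC_ZERO = 0x00000000      # the only zero output in DEC mode


def dec_fp_exception(rounded):
    """Classify the rounded result of an R PLUS S, R MINUS S, or
    R TIMES S operation (overflow also applies to 2 MINUS S)."""
    magnitude = abs(Fraction(rounded))
    if magnitude >= Fraction(2) ** 127:
        return DEC_RESERVED, {"overflow", "inexact", "nan"}
    if 0 < magnitude < Fraction(1, 2 ** 128):
        return DEC_ZERO, {"underflow", "inexact", "zero"}
    if magnitude == 0:
        return DEC_ZERO, {"zero"}
    return None, set()  # normalized result, no forced output


print(dec_fp_exception(Fraction(2) ** 127))
print(dec_fp_exception(Fraction(1, 2 ** 129)))
```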

IEEE-TO-DEC and DEC-TO-IEEE Operations

The IEEE-TO-DEC and DEC-TO-IEEE operations are used to 
convert floating-point numbers between the IEEE and DEC 
formats. Both operations work in a manner independent of the 
IEEE/DEC mode control. 

IEEE-TO-DEC Conversion 

This operation converts an IEEE floating-point number to DEC
floating-point format. Most conversions are exact; in no case
does the round mode have any effect on the final result. There
are, however, a few exceptional cases:

a) If the IEEE floating-point input has a magnitude greater than
   or equal to 2^127, it is too large to be represented by a DEC
   floating-point number. The final result will be the DEC-reserved
   operand 80000000₁₆; the overflow, inexact, and
   NAN flags will be HIGH.

b) If the IEEE floating-point input is a NAN, the final result will
   be the DEC-reserved operand 80000000₁₆; the invalid and
   NAN flags will be HIGH.

c) If the IEEE floating-point input is a denormalized number,
   the final result will be a DEC 0 (00000000₁₆); the zero flag
   will be HIGH.

d) If the IEEE floating-point input is +0 or -0, the final result
   will be a DEC 0 (00000000₁₆); the zero flag will be HIGH.
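
The cases above can be sketched at the bit level. An IEEE value 1.f × 2^(e-127) equals 0.1f × 2^(e-126), so in the ordinary case the DEC biased exponent is e + 2 and the sign and fraction fields carry over unchanged. The function name is hypothetical, and treating an infinite input under case (a) is an assumption of this sketch.

```python
DEC_RESERVED = 0x80000000


def ieee_to_dec(word):
    """Convert an IEEE single-precision bit pattern to the DEC-mode
    layout; returns (result word, set of flags that are HIGH)."""
    sign = word & 0x80000000
    exp = (word >> 23) & 0xFF
    frac = word & 0x7FFFFF
    if exp == 255 and frac != 0:           # case b: NAN input
        return DEC_RESERVED, {"invalid", "nan"}
    if exp >= 254:                         # case a: magnitude >= 2^127
        return DEC_RESERVED, {"overflow", "inexact", "nan"}
    if exp == 0:                           # cases c and d: denormal or zero
        return 0x00000000, {"zero"}
    return sign | ((exp + 2) << 23) | frac, set()


result, flags = ieee_to_dec(0x40600000)    # IEEE +3.5
print(hex(result), flags)
```

The IEEE word for +3.5 maps to 41600000₁₆, the same DEC word derived in the format examples.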

DEC-TO-IEEE Conversion 

This operation converts a DEC floating-point number to IEEE 
floating-point format. Most conversions are exact; in no case 
does the round mode have any effect on the final result. There 
are, however, a few exceptional cases: 

a) If the DEC floating-point input is not 0, but has a magnitude
   less than 2^-126, it is too small to be expressed as a
   normalized IEEE floating-point number. The final result will
   be an IEEE floating-point 0 having the same sign as the
   input (00000000₁₆ for positive inputs and 80000000₁₆ for
   negative inputs); the underflow, inexact, and zero flags will
   be HIGH.

b) If the DEC floating-point input is a DEC-reserved operand, 
the result will be quiet NAN TFAOOOO-is; the invalid opera- 
tion and NAN flags will be HIGH. 

c) If the DEC floating-point input is 0, the final result will be
   IEEE floating-point +0 (00000000₁₆); the zero flag will be
   HIGH.



APPLICATIONS

Suggestions for Power and Ground Pin
Connections

The Am29325 operates in an environment of fast signal rise
times and substantial switching currents. Therefore, care must
be exercised during circuit board design and layout, as with
any high-performance component. The following is a suggested
layout, but since systems vary widely in electrical
configuration, an empirical evaluation of the intended layout is
recommended.

The VccT and GNDT pins, which carry output driver switching
currents, tend to be electrically noisy. The VccE and GNDE
pins, which supply the ECL core of the device, tend to produce
less noise, and the circuits they supply may be adversely
affected by noise spikes on the VccE plane. For this reason, it
is best to provide isolation between the VccE and VccT pins,
as well as independent decoupling for each. Isolating the
GNDE and GNDT pins is not required.

Printed Circuit-Board Layout Suggestions

1) Use of a multilayer PC board with separate power, ground,
   and signal planes is highly recommended.

2) All VccE and VccT pins should be connected to the Vcc
   plane. VccT pins should be isolated from VccE pins by means
   of a slot cut in the VccE plane (see Figure 18). By physically
   separating the VccE and VccT pins, coupled noise will be
   reduced.

3) All GNDE and GNDT pins should be connected directly to
   the ground plane.

4) The VccT pins should be decoupled to ground with a 0.1-µF
   ceramic capacitor and a 10-µF electrolytic capacitor, placed
   as closely to the Am29325 as is practical. VccE pins should
   be decoupled to ground in a similar manner. A suggested
   layout is shown in Figure 18.

Figure 18. Suggested Printed-Circuit Board Layout (Power and Ground Connections)

Figure 19. Am29325 Thermal Characteristics (Typical)
APPENDIX A

DIFFERENCES BETWEEN THE IEEE
PROPOSED STANDARD FOR BINARY
FLOATING-POINT ARITHMETIC AND THE
Am29325'S IEEE MODE

When operated in IEEE mode, the Am29325 High-Speed
Floating-Point Processor complies with the single-precision
portion of the IEEE Proposed Standard for Binary Floating-Point
Arithmetic (P754, draft 10.0) in most respects. There are,
however, several differences:

Denormalized Numbers 

The Am29325 does not handle denormalized numbers. A
denormalized input will be converted to zero of the same sign
before the specified operation takes place. The operation
proceeds in exactly the same manner as if the input were +0
or -0, producing the same numerical result and flags.

If the result of an operation, after rounding, has a magnitude
smaller than 2^-126, the result is replaced by a zero of the
same sign.

Representation of Overflows 

In some rounding modes the proposed IEEE standard requires 
that overflows be represented as the format's most-positive or 
most-negative finite number. In particular: 

- When rounding toward 0, all overflows should produce a
  result of the largest representable finite number with the
  sign of the intermediate result.

- When rounding toward -∞, all positive overflows should
  produce a result of the largest representable positive finite
  number.

- When rounding toward +∞, all negative overflows should
  produce a result of the largest representable negative finite
  number.

The Am29325, however, always represents positive overflows
as +∞ and negative overflows as -∞, regardless of rounding
mode.

Projective Mode 

The proposed IEEE standard provides only for an affine mode
to control the handling of infinities. The Am29325 provides
both affine and projective modes; the desired mode can be
selected by the user.

Traps 

The proposed IEEE standard stipulates that the user be able
to request a trap on any exception. The Am29325 does not
support trapped operation, and behaves as if traps are
disabled.

Resetting of Flags 

The proposed IEEE standard states that once an exception 
flag has been set, it is reset only at the user's request. The 
Am29325's flags, however, reflect the status of the most 
recent operation. 

Generation of the Underflow Flag 

The proposed IEEE standard suggests several possible crite- 
ria for determining if underflow occurs. These criteria generate 
underflow flags that differ in subtle ways. The underflow 
criteria chosen for the Am29325 stipulate that underflow 
occurs if: 

a) the rounded result of an operation has a magnitude in the
   range:

   0 < magnitude < 2^-126

and

b) the final result is not equal to the infinitely precise result.

Since the Am29325 never produces a denormalized number 
as the final result of a calculation, condition (b) is true 
whenever (a) is true. Note then that the operation of the 
Am29325's underflow flag is somewhat different than that of 
an "IEEE standard" system using the same underflow criteria. 
For example, if an operation should produce an infinitely 
precise result that is exactly 2'^^^, an "IEEE standard" 
system would produce that value as the final result, expressed 
as a denormalized number. Since that system's final result is 
exact, the underflow flag would remain LOW. The Am29325, 
on the other hand, would output zero; since its final result is 
not exact, the underflow flag would be HIGH. 



APPENDIX B 

DIFFERENCES BETWEEN DEC VAX AND 
Am29325 DEC MODE 

Operation in DEC mode complies with most aspects of single- 
precision floating-point operation outlined in the Digital Equip- 
ment Corporation's VAX Architecture Manual. However, there 
are some differences that should be noted: 

Format 

The Am29325's DEC format is:

    sign      - bit 31
    exponent  - bits 30-23
    mantissa  - bits 22-0

The VAX format is:

    sign      - bit 15
    exponent  - bits 14-7
    mantissa  - bits 6-0, bits 31-16

In both cases, fields are listed from MSB to LSB, with bit 31
the MSB of the 32-bit word. The Am29325's DEC format can
be converted to VAX format by swapping the 16 LSBs and 16
MSBs of the 32-bit word.
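
The half-word swap is simple enough to show directly. The helper name below is hypothetical; the same function converts in either direction because swapping twice restores the original word.

```python
def swap_halves(word):
    """Exchange the 16 MSBs and 16 LSBs of a 32-bit word."""
    return ((word & 0xFFFF) << 16) | ((word >> 16) & 0xFFFF)


am29325_word = 0x41600000          # +3.5 in the Am29325's DEC layout
vax_word = swap_halves(am29325_word)
print(hex(vax_word))               # sign now sits in bit 15
```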

Flags vs. Exceptions 

In DEC VAX operation, certain unusual conditions arising 
during system operation may incur an exception, or an 
indication to the operating system that special handling is 
needed. 

The VAX recognizes a number of arithmetic exceptions. The
following exceptions are relevant to the operations supported
by the Am29325:

Integer Overflow Trap: indicates that the last operation
produced an integer overflow. The LSBs of the correct result
are stored in the destination operand.

Floating-Point Overflow Trap/Fault: indicates that the last
operation produced, after normalization and rounding, a floating-point
number with magnitude greater than or equal to 2^127.
A trap replaces the destination operand with the DEC-reserved
operand 80000000₁₆; a fault leaves the destination
operand unchanged.

Floating-Point Underflow Trap/Fault: indicates that the last
operation produced, after normalization and rounding, a floating-point
number with magnitude less than 2^-128. A trap
replaces the destination operand with zero; a fault leaves the
destination operand unchanged.

Reserved Operand Fault: indicates that the last operation
had a reserved operand as an input. The destination operand
is unchanged.

The Am29325 does not directly support DEC traps and faults.
Rather, it indicates unusual conditions by setting one or more
of the six status flags HIGH. Table D2 describes flag operation
in DEC mode.

Integer Overflow 

In cases of integer overflow, the VAX signals the integer
overflow trap and stores the LSBs of the correct result. The
Am29325 sets the invalid operation flag and outputs the DEC-reserved
operand 80000000₁₆.



Floating-Point Underflow/Overflow Operation 

The VAX Architecture Manual specifies the action to be taken
on the destination operand when floating-point underflow or
overflow is encountered. The Am29325 has no immediate
control over this destination operand, as it resides somewhere
off-chip, either in a register or memory location. This isn't so
much a difference between the VAX specification and
Am29325 operation as it is a difference in scope.

The Am29325 responds to floating-point underflow by producing
a final result of 0 (00000000₁₆); the underflow, inexact,
and zero flags will be HIGH. It responds to floating-point
overflow by producing the DEC-reserved operand 80000000₁₆
as the final result; the overflow, inexact, and NAN flags will be
HIGH.

Handling of DEC-Reserved Operands 

If an operation has a DEC-reserved operand as an input, the 
Am29325 will produce that operand as the final result. If an 
operation has two input arguments and both are DEC- 
reserved operands, the operand on port R becomes the final 
result. For the VAX, operations with a DEC-reserved operand 
input or inputs do not modify the destination operand. As 
mentioned above, control of the destination operand is be- 
yond the scope of the Am29325's operation. 

Inexact Flag 

The Am29325 provides an inexact flag to indicate that the final
result produced by an operation is not equal to the infinitely
precise result. The VAX does not provide this flag.



APPENDIX C 

PERFORMING FLOATING-POINT DIVISION 
ON THE Am29325 

While the Am29325 does not have a floating-point division 
instruction, it can be used to evaluate reciprocals. The 
division: 

C = A/B 

can then be performed by evaluating: 

C = A*(1/B) 

Only a modest amount of external hardware is needed to 
implement the reciprocal function. 

The technique for calculating reciprocals is based on the
Newton-Raphson method for obtaining the roots of an equation.
The roots of the equation:

F(x) = 0

can be found by iteratively evaluating the equation:

x_{i+1} = x_i - F(x_i)/F'(x_i)

The process begins by making a guess as to the value of x_0,
and using this guess or "seed" value to perform the first
iteration. Iterations are continued until the root is evaluated to
the desired accuracy. The number of iterations needed to
achieve a given accuracy depends both on the accuracy of the
seed value and the nature of F(x).

Now consider the equation:

F(x) = (1/x) - B

The root of F(x) is 1/B. The reciprocal of B, then, can be found
by using the Newton-Raphson method to find the root of F(x).
The iterative equation for finding the root is:

x_{i+1} = x_i - F(x_i)/F'(x_i)
        = x_i - (1/x_i - B)/(-(x_i)^-2)
        = x_i (2 - B*x_i)


It can be shown that, in order for this iterative equation to
converge, the seed value x_0 must fall in the range:

0 < x_0 < 2/B     if B > 0
2/B < x_0 < 0     if B < 0

For example, if the reciprocal of 3 is to be evaluated, the seed
value must be between 0 and 2/3.

The error of x_i reduces quadratically; that is, if the error of x_i is
e, the error is reduced to order e^2 by the next iteration. The
number of bits of accuracy in the result, then, roughly doubles
after every iteration. While this is only an approximation of the
actual error produced, it is a handy rule of thumb for
determining the number of iterations needed to produce a
result of a certain accuracy, given the accuracy of the seed.
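
Both the convergence range and the quadratic error reduction follow from one substitution. The short derivation below is a sketch; the symbol e_i for the error of x_i is introduced here for illustration.

```latex
\begin{aligned}
x_i &= \tfrac{1}{B} + e_i \\
x_{i+1} &= x_i\,(2 - B x_i)
         = \left(\tfrac{1}{B} + e_i\right)\left(1 - B e_i\right)
         = \tfrac{1}{B} - B e_i^{2} \\
e_{i+1} &= -B e_i^{2}
\quad\Longrightarrow\quad
B e_{i+1} = -\left(B e_i\right)^{2}
\end{aligned}
```

The scaled error B e_i is squared at every step, so it shrinks toward zero exactly when |B e_0| < 1, that is, when |B x_0 - 1| < 1, or 0 < B x_0 < 2. Dividing by B gives the two seed ranges stated for B > 0 and B < 0.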

Example 1:

Find the reciprocal of 7.25.

Solution:

The seed value must fall in the range:

0 < x_0 < 2/7.25
or 0 < x_0 < .275862

Suppose x_0 is chosen to be .1:

Iteration 1: x_1 = x_0 (2 - B*x_0)
                 = .1 (2 - (7.25)(.1))
                 = .1275

Iteration 2: x_2 = x_1 (2 - B*x_1)
                 = .1275 (2 - (7.25)(.1275))
                 = .1371421875

Iteration 3: x_3 = x_2 (2 - B*x_2)
                 = .1371421875 (2 - (7.25)(.1371421875))
                 = .1379265230

The actual value of 1/7.25, to ten decimal places, is
.1379310345.

The error after each iteration is:

Iteration    x_i             Error to Ten Places
0            .1              -0.0379310345
1            .1275           -0.0104310345
2            .1371421875     -0.0007888470
3            .1379265230     -0.0000045115



Example 2:

Find the reciprocal of -.3.

Solution:

The seed value must fall in the range:

2/(-.3) < x_0 < 0
or -6.66 < x_0 < 0

Suppose x_0 is chosen to be -2.0:

Iteration 1: x_1 = x_0 (2 - B*x_0)
                 = -2.0 (2 - (-.3)(-2.0))
                 = -2.8

Iteration 2: x_2 = x_1 (2 - B*x_1)
                 = -2.8 (2 - (-.3)(-2.8))
                 = -3.248

Iteration 3: x_3 = x_2 (2 - B*x_2)
                 = -3.248 (2 - (-.3)(-3.248))
                 = -3.3311488

Iteration 4: x_4 = x_3 (2 - B*x_3)
                 = -3.3311488 (2 - (-.3)(-3.3311488))
                 = -3.333331902

The actual value of 1/(-.3), to ten decimal places, is
-3.333333333.

The error after each iteration is:

i    x_i             Error to Ten Places
0    -2.0            1.333333333
1    -2.8            0.533333333
2    -3.248          0.085333333
3    -3.3311488      0.002184533
4    -3.333331902    0.000001431
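
The two worked examples can be reproduced with a few lines of ordinary double-precision arithmetic. This is a sketch of the iteration only, not of the Am29325 data path; the function name and the seed-range check are illustrative.

```python
def reciprocal_iterates(b, seed, iterations):
    """Return [x_0, x_1, ..., x_n] for x_{i+1} = x_i * (2 - b * x_i)."""
    if not 0 < b * seed < 2:
        raise ValueError("seed outside the convergence range")
    values = [seed]
    for _ in range(iterations):
        x = values[-1]
        values.append(x * (2 - b * x))
    return values


for b, seed, n in ((7.25, 0.1, 3), (-0.3, -2.0, 4)):
    for i, x in enumerate(reciprocal_iterates(b, seed, n)):
        print(i, f"{x:.10f}", f"{x - 1 / b:+.10f}")
```

Each printed line gives the iteration number, the estimate, and its error relative to 1/B, which can be compared against the two tables above.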



In order to implement the Newton-Raphson method on the
Am29325, some means is needed to generate the seed used
in the first iteration. One approach is to place a hardware seed
look-up table between the R bus and the Am29325; see Figure
C1. A more detailed diagram of the look-up table appears in
Figure C2.
TABLE C1. CONTENTS OF THE SEED EXPONENT PROM

          DEC                          IEEE
Address (16)   Data (16)     Address (16)   Data (16)
000            (Note 1)      100            (Note 1)
001            (Note 1)      101            FC
002            FF            102            FB
003            FE            103            FA
004            FD            104            F9
005            FC            105            F8
006            FB            106            F7
007            FA            107            F6
008            F9            108            F5
009            F8            109            F4
00A            F7            10A            F3
00B            F6            10B            F2
00C            F5            10C            F1
00D            F4            10D            F0
00E            F3            10E            EF
00F            F2            10F            EE
010            F1            110            ED
011            F0            111            EC
012            EF            112            EB
...            ...           ...            ...
0EE            13            1EE            0F
0EF            12            1EF            0E
0F0            11            1F0            0D
0F1            10            1F1            0C
0F2            0F            1F2            0B
0F3            0E            1F3            0A
0F4            0D            1F4            09
0F5            0C            1F5            08
0F6            0B            1F6            07
0F7            0A            1F7            06
0F8            09            1F8            05
0F9            08            1F9            04
0FA            07            1FA            03
0FB            06            1FB            02
0FC            05            1FC            01
0FD            04            1FD            (Note 2)
0FE            03            1FE            (Note 2)
0FF            02            1FF            (Note 2)

Notes: 1. The reciprocals of these numbers are too large to be represented in the
          selected format.
       2. The reciprocals of these numbers are too small to be represented in
          normalized IEEE format.

Figure C1. Adding a Hardware Look-Up Table to the Am29325



The look-up table has two sections: a biased exponent look-up
PROM, and a fraction look-up PROM. The seed-biased
exponent look-up table is stored in a 512-by-8-bit PROM. This
table consists of two sections: the DEC format section (which
occupies addresses 000-0FF₁₆), and the IEEE section
(which occupies addresses 100-1FF₁₆). The appropriate
table will be selected automatically if address line A8 is wired
to the Am29325's IEEE/DEC pin. The equations implemented
by these table sections are:

DEC table: seed biased exponent
           = 257₁₀ - input biased exponent

IEEE table: seed biased exponent
           = 253₁₀ - input biased exponent

Table C1 lists the contents of this PROM.
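
The two equations are enough to regenerate the PROM image. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and entries that the table marks with Note 1 or Note 2 are represented as None, since the source does not say what data those locations hold.

```python
def seed_exponent_prom():
    """Build the 512-entry seed exponent table described above.
    Address bit A8 is LOW for the DEC section, HIGH for IEEE."""
    prom = []
    for address in range(512):
        ieee = bool(address & 0x100)
        exponent = address & 0xFF
        seed = (253 if ieee else 257) - exponent
        # A biased exponent of 0 is not a normalized input, and a seed
        # outside 1..255 cannot be stored as a normalized exponent.
        valid = exponent != 0 and 1 <= seed <= 255
        prom.append(seed if valid else None)
    return prom


prom = seed_exponent_prom()
for address in (0x002, 0x0FF, 0x101, 0x1FC, 0x1FD):
    entry = prom[address]
    print(f"{address:03X}", "note" if entry is None else f"{entry:02X}")
```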

The seed fraction look-up table is stored in one or more
PROMs, the number of PROMs depending on the desired
accuracy of the seed value. The hardware depicted in Figure
C2 uses two 4K-by-8-bit PROMs to implement a fraction look-up
table whose inputs are the 12 MSBs of the input argument's
fraction. These PROMs output the 16 MSBs of the
seed's fraction field; the remaining 7 bits of fraction are set
to 0. The equation implemented in this table is:

seed fraction = 2/(1 + input fraction) - 1

where the value of the input fraction falls in the range

0 ≤ input fraction < 1

Note that the seed fraction must also be constrained to fall in
the range

0 ≤ seed fraction < 1

Therefore, if the input fraction is 0, the corresponding seed
fraction stored in the table must be .111...111₂, not 1.0₂. The
same seed fraction look-up table may be used for both IEEE
and DEC formats. Table C2 contains a partial listing for the
seed fraction look-up table shown in Figure C2.
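
The seed fraction equation can be evaluated exactly for each 12-bit address. In the sketch below the function names are hypothetical, and quantizing to 16 bits by rounding to nearest is an assumption: the source states only that the PROMs supply the 16 MSBs of the seed fraction, so individual entries may differ by one LSB from a real PROM image.

```python
from fractions import Fraction


def seed_fraction(address):
    """Exact seed fraction for a 12-bit PROM address, where the
    input fraction is address / 4096."""
    input_fraction = Fraction(address, 4096)
    return 2 / (1 + input_fraction) - 1


def seed_prom_word(address):
    """16-bit PROM contents (assumed round-to-nearest), clamped so
    that an input fraction of 0 stores .111...1 rather than 1.0."""
    word = round(seed_fraction(address) * 65536)
    return min(word, 0xFFFF)


for address in (0x000, 0x001, 0xFFF):
    word = seed_prom_word(address)
    print(f"{address:03X}", float(seed_fraction(address)),
          f"{word >> 8:02X}", f"{word & 0xFF:02X}")
```

The high byte corresponds to fraction bits R22-R15 and the low byte to R14-R7.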






TABLE C2. CONTENTS OF THE SEED FRACTION PROMS

               Value of Input   Value of Seed              PROM Outputs (16)
Address (16)   Fraction (10)    Fraction (10)              R22-R15   R14-R7
000            0.0              0.9999999999 (see text)    FF        FF
001            0.0002441406     0.9995118370               FF        E0
002            0.0004882812     0.9990239150               FF        C0
003            0.0007324219     0.9985362260               FF        A0
004            0.0009765625     0.9980487790               FF        80
005            0.0012207031     0.9975615710               FF        60
006            0.0014648438     0.9970745970               FF        40
007            0.0017089844     0.9965878630               FF        20
008            0.0019531250     0.9961013650               FF        00
009            0.0021972656     0.9956151030               FE        E1
00A            0.0024414063     0.9951290800               FE        C0
00B            0.0026855469     0.9946432920               FE        A1
00C            0.0029296875     0.9941577400               FE        81
...            ...              ...                        ...       ...
FF6            0.9975585938     0.0012221950               00        50
FF7            0.9978027344     0.0010998410               00        48
FF8            0.9980468750     0.0009775170               00        40
FF9            0.9982910156     0.0008552230               00        38
FFA            0.9985351563     0.0007329590               00        30
FFB            0.9987792969     0.0006107240               00        28
FFC            0.9990234375     0.0004885200               00        20
FFD            0.9992675781     0.0003663450               00        18
FFE            0.9995117188     0.0002442000               00        10
FFF            0.9997558594     0.0001220850               00        08


[Figure: block diagram. The sign (R31) passes through as the seed
sign; the biased exponent (R30-R23) addresses the seed exponent
PROM; the 12 MSBs of the fraction address two Am27S43 4K x 8 seed
fraction PROMs.]

Figure C2. The Hardware Look-Up Table



With the hardware look-up table in place, the reciprocal of
value B can be calculated with the following series of
operations:

1) Place B on both the R and S buses. The 2:1 multiplexer at
   the output of the hardware look-up table should select the
   output of the look-up table (see Figure C3-A).

2) Load the seed value x_0 into register R and load B into
   register S. Select the R TIMES S operation (see Figure
   C3-B).

3) Load product B*x_0 into register F. Select the 2 MINUS S
   operation, and select register F as the input to the ALU S
   port (see Figure C3-C).

4) Load 2 - B*x_0 into register F. Select the R TIMES S
   operation and select register F as the input to the ALU S
   port (see Figure C3-D).

5) Load the value x_1 (x_1 = x_0(2 - B*x_0)) into registers R and F.
   Select the R TIMES S operation (see Figure C3-E).

6) Repeat steps 3 through 5 until the result has the accuracy
   desired.
Figure C3-A. Data Flow for Step 1 of the Reciprocal Procedure

Figure C3-B. Data Flow for Step 2 of the Reciprocal Procedure
[Figure: block diagram showing buses R, S, and F, the seed look-up
table, the 2:1 multiplexers, registers R, S, and F, and the ALU
ports of the Am29325.]

Figure C3-C. Data Flow for Step 3 of the Reciprocal Procedure
[Figure: block diagram showing buses R, S, and F, the seed look-up
table, the 2:1 multiplexers, registers R, S, and F, and the ALU
ports of the Am29325.]

Figure C3-D. Data Flow for Step 4 of the Reciprocal Procedure
[Figure: block diagram showing buses R, S, and F, the seed look-up
table, the 2:1 multiplexers, registers R, S, and F, and the ALU
ports of the Am29325.]

Figure C3-E. Data Flow for Step 5 of the Reciprocal Procedure

A tabular description of the operations above is given in Table
C3. The following examples, performed in IEEE format,
illustrate the process.

Example 1:

Find the reciprocal of 25.3.

Solution: The IEEE floating-point representation for 25.3 is
41CA6666₁₆. The reciprocal process is begun by
feeding this value to both the seed look-up table
and port S. The look-up table produces the value
.03952789₁₀ (3D21E800₁₆). The reciprocal is
evaluated using the procedure described above;
register values for each step are given in Table C4.
The expected result, to the precision of the
floating-point word, is .03952569₁₀ (3D21E5B1₁₆). In
this case the expected result is produced after the
first iteration. All subsequent iterations produce the
same result, and are therefore unnecessary.
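
The register values in Example 1 can be checked in software. The following is a minimal Python sketch, not AMD code: it assumes round-to-nearest single-precision arithmetic and takes the seed produced by the look-up table as a given input. The helper names (`to_single`, `from_hex`, `to_hex`, `reciprocal_trace`) are illustrative only.

```python
import struct


def to_single(x):
    """Round a Python float to the nearest IEEE single-precision value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]


def from_hex(word):
    """Interpret a 32-bit integer as an IEEE single-precision number."""
    return struct.unpack(">f", struct.pack(">I", word))[0]


def to_hex(x):
    """Return the 32-bit IEEE single-precision pattern of x as an integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def reciprocal_trace(b, seed, iterations=2):
    """Run x <- x * (2 - b * x), rounding after every operation.

    Returns one (b*x, 2 - b*x, new x) tuple per iteration. Each product of
    two single-precision values is exact in double precision, so rounding
    once with to_single models a single-precision operation.
    """
    x = seed
    trace = []
    for _ in range(iterations):
        product = to_single(b * x)              # R TIMES S
        correction = to_single(2.0 - product)   # 2 MINUS S
        x = to_single(x * correction)           # R TIMES S
        trace.append((product, correction, x))
    return trace


if __name__ == "__main__":
    b = from_hex(0x41CA6666)       # 25.3
    seed = from_hex(0x3D21E800)    # seed from the look-up table
    for step in reciprocal_trace(b, seed):
        print(" ".join(f"{to_hex(v):08X}" for v in step))
```

Each printed line holds the three values that pass through register F during one iteration, so the first line can be compared with clock cycles 3 through 5 of Table C4 and the second with cycles 6 through 8.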





TABLE C3. SEQUENCE OF EVENTS FOR EVALUATING RECIPROCALS

Clock
Cycle  I0-I2      I3  I4  ENR  ENS  ENF  Register R            Register S  Register F
1      X          X   0   0    0    X    -                     -           -
2      R TIMES S  0   X   1    0    0    x0                    B           -
3      2 MINUS S  1   X   1    0    0    x0                    B           B*x0
4      R TIMES S  1   1   0    0    0    x0                    B           2 - B*x0
5      R TIMES S  0   X   1    0    0    x1 (= x0(2 - B*x0))   B           x1 (= x0(2 - B*x0))
6      2 MINUS S  1   X   1    0    0    x1                    B           B*x1
7      R TIMES S  1   1   0    0    0    x1                    B           2 - B*x1
8      R TIMES S  0   X   1    0    0    x2 (= x1(2 - B*x1))   B           x2 (= x1(2 - B*x1))

Cycles 2 through 4 make up the first iteration; cycles 5 through 7
make up the second iteration.

X = DON'T CARE

TABLE C4. INPUT BUS AND REGISTER VALUES FOR EXAMPLE 1

Values are hexadecimal, with the decimal equivalent in parentheses.

Clock
Cycle  R Input               S Input          Register R            Register S       Register F
1      3D21E800 (.03952789)  41CA6666 (25.3)  -                     -                -
2      -                     -                3D21E800 (.03952789)  41CA6666 (25.3)  -
3      -                     -                3D21E800 (.03952789)  41CA6666 (25.3)  3F8001D3 (1.0000556)
4      -                     -                3D21E800 (.03952789)  41CA6666 (25.3)  3F7FFC5A (.99994433)
5      -                     -                3D21E5B1 (.03952569)  41CA6666 (25.3)  3D21E5B1 (.03952569)  <- Result of first iteration
6      -                     -                3D21E5B1 (.03952569)  41CA6666 (25.3)  3F7FFFFF (.99999994)
7      -                     -                3D21E5B1 (.03952569)  41CA6666 (25.3)  3F800000 (1.0)
8      -                     -                3D21E5B1 (.03952569)  41CA6666 (25.3)  3D21E5B1 (.03952569)  <- Result of second iteration

Example 2:

Find the reciprocal of -.4725.

Solution: The IEEE floating-point representation for -.4725
is BEF1EB85₁₆. The reciprocal process is begun
by feeding this value to both the seed look-up table
and port S. The look-up table produces the value
-2.11621094₁₀ (C0077000₁₆). The reciprocal is
evaluated using the procedure described above;
register values for each step are given in Table C5.
The expected result, to the precision of the
floating-point word, is -2.116402₁₀ (C0077322₁₆). In
this case the expected result is produced after the
first iteration. All subsequent iterations produce the
same result, and are therefore unnecessary.




TABLE C5. INPUT BUS AND REGISTER VALUES FOR EXAMPLE 2

Values are hexadecimal, with the decimal equivalent in parentheses.

Clock
Cycle  R Input                S Input             Register R             Register S          Register F
1      C0077000 (-2.1162109)  BEF1EB85 (-0.4725)  -                      -                   -
2      -                      -                   C0077000 (-2.1162109)  BEF1EB85 (-0.4725)  -
3      -                      -                   C0077000 (-2.1162109)  BEF1EB85 (-0.4725)  3F7FFA14 (0.99990963)
4      -                      -                   C0077000 (-2.1162109)  BEF1EB85 (-0.4725)  3F8002F6 (1.0000904)
5      -                      -                   C0077322 (-2.116402)   BEF1EB85 (-0.4725)  C0077322 (-2.116402)   <- Result of first iteration
6      -                      -                   C0077322 (-2.116402)   BEF1EB85 (-0.4725)  3F800000 (1.0)
7      -                      -                   C0077322 (-2.116402)   BEF1EB85 (-0.4725)  3F800000 (1.0)
8      -                      -                   C0077322 (-2.116402)   BEF1EB85 (-0.4725)  C0077322 (-2.116402)   <- Result of second iteration

APPENDIX D

SUMMARY OF FLAG OPERATION

Tables D1, D2, and D3 summarize flag operation for the IEEE
mode, the DEC mode, and for the IEEE-TO-DEC and DEC-TO-IEEE
operations.



TABLE D1. FLAG SUMMARY FOR IEEE MODE

Operation           Condition(s)                    INV  OVF  UNF  INE  ZER  NAN

Any operation                                       H    L    L    L    L    H
listed in the
IEEE Invalid
Operations Table

R PLUS S            Input operands are finite,      L    H    L    H    L    L
R MINUS S           |rounded result| ≥ 2^128
R TIMES S
2 MINUS S

R PLUS S            0 < |rounded result| < 2^-126   L    L    H    H    H    L
R MINUS S
R TIMES S

R PLUS S            Final result does not equal     L    *    *    H    *    L
R MINUS S           infinitely precise result
R TIMES S
2 MINUS S
INT-TO-FP
FP-TO-INT

R PLUS S            Final result is zero            L    L    *    *    H    L
R MINUS S
R TIMES S
2 MINUS S
INT-TO-FP
FP-TO-INT

R PLUS S            Final result is a NAN           *    L    L    L    L    H
R MINUS S
R TIMES S
2 MINUS S
FP-TO-INT

Notes: INV = Invalid operation flag
       OVF = Overflow flag
       UNF = Underflow flag
       INE = Inexact flag
       ZER = Zero flag
       NAN = NAN flag
       L = LOW
       H = HIGH
       * = State of flag depends on the input operands and the
           operation performed
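
The overflow, underflow, inexact, and zero rows of Table D1 can be paraphrased in code. The following Python sketch is an illustration under stated assumptions, not a model of the device: it assumes finite operands, round-to-nearest rounding, and a 24-bit significand, and it takes the infinitely precise result as an exact `Fraction`. The function names `round_to_single` and `ieee_flags` are hypothetical.

```python
from fractions import Fraction

OVERFLOW_EXP = 128     # |rounded result| >= 2**128 sets OVF
UNDERFLOW_EXP = -126   # 0 < |rounded result| < 2**-126 sets UNF


def round_to_single(x):
    """Round an exact Fraction to a 24-bit significand, exponent unbounded."""
    if x == 0:
        return Fraction(0)
    mag = abs(x)
    # Find e such that 2**e <= mag < 2**(e + 1).
    e = mag.numerator.bit_length() - mag.denominator.bit_length()
    if Fraction(2) ** e > mag:
        e -= 1
    ulp = Fraction(2) ** (e - 23)
    n = round(mag / ulp)   # round() on a Fraction rounds half to even
    r = n * ulp
    return r if x > 0 else -r


def ieee_flags(exact):
    """Flags for a finite, valid operation with the given exact result."""
    r = round_to_single(exact)
    ovf = abs(r) >= Fraction(2) ** OVERFLOW_EXP
    unf = 0 < abs(r) < Fraction(2) ** UNDERFLOW_EXP
    zer = r == 0 or unf            # Table D1 shows ZER HIGH on underflow
    ine = ovf or unf or r != exact
    return {"INV": False, "OVF": ovf, "UNF": unf,
            "INE": ine, "ZER": zer, "NAN": False}
```

For example, `ieee_flags(Fraction(1, 3))` reports only INE, because one third has no exact 24-bit representation. The invalid-operation and NAN rows are outside the scope of this sketch.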

TABLE D2. FLAG SUMMARY FOR DEC MODE

Operation           Condition(s)                    INV  OVF  UNF  INE  ZER  NAN

FP-TO-INT           Rounded result > 2^31 - 1       H    L    L    L    L    H
                    or rounded result < -2^31

FP-TO-INT           Input is a DEC-reserved         L    L    L    L    L    H
                    operand

R PLUS S            |Rounded result| ≥ 2^127        L    H    L    H    L    H
R MINUS S
R TIMES S
2 MINUS S

R PLUS S            0 < |rounded result| < 2^-128   L    L    H    H    H    L
R MINUS S
R TIMES S

R PLUS S            Final result does not equal     L    *    *    H    *    *
R MINUS S           infinitely precise result
R TIMES S
2 MINUS S
INT-TO-FP
FP-TO-INT

R PLUS S            Final result is zero            L    L    *    *    H    L
R MINUS S
R TIMES S
2 MINUS S
INT-TO-FP
FP-TO-INT

R PLUS S            Final result is a DEC-reserved  *    *    L    L    L    H
R MINUS S           operand
R TIMES S
2 MINUS S
FP-TO-INT

Notes: INV = Invalid operation flag
       OVF = Overflow flag
       UNF = Underflow flag
       INE = Inexact flag
       ZER = Zero flag
       NAN = NAN flag
       L = LOW
       H = HIGH
       * = State of flag depends on the input operands and the
           operation performed



TABLE D3. FLAG SUMMARY FOR IEEE-TO-DEC AND DEC-TO-IEEE CONVERSIONS

Operation           Condition(s)                     INV  OVF  UNF  INE  ZER  NAN

IEEE-TO-DEC         Input is a NAN                   H    L    L    L    L    H

IEEE-TO-DEC         |Input| ≥ 2^127                  L    H    L    H    L    H

DEC-TO-IEEE         Input is a DEC-reserved operand  H    L    L    L    L    H

DEC-TO-IEEE         0 < |rounded result| < 2^-126    L    L    H    H    H    L

DEC-TO-IEEE         Final result is zero             L    L    *    *    H    L
IEEE-TO-DEC

Notes: INV = Invalid operation flag
       OVF = Overflow flag
       UNF = Underflow flag
       INE = Inexact flag
       ZER = Zero flag
       NAN = NAN flag
       L = LOW
       H = HIGH
       * = State of flag depends on the input operands and the
           operation performed

ABSOLUTE MAXIMUM RATINGS

Storage Temperature                    -65 to +150°C
Temperature Under Bias (Tc)            -55 to +125°C
Supply Voltage to Ground Potential
  Continuous                           -0.5 to +7.0 V
DC Voltage Applied to Outputs
  for HIGH State                       -0.5 V to +Vcc Max.
DC Input Voltage                       -0.5 to +5.5 V
DC Output Current, into Outputs        30 mA
DC Input Current                       -30 to +5.0 mA

Stresses above those listed under ABSOLUTE MAXIMUM
RATINGS may cause permanent device failure. Functionality
at or above these limits is not implied. Exposure to absolute
maximum ratings for extended periods may affect device
reliability.

OPERATING RANGES

Commercial (C) Devices
Temperature, Case (Tc)                 0 to +85°C
Supply Voltage (Vcc)                   +4.75 to +5.25 V

Operating ranges define those limits between which the
functionality of the device is guaranteed.


Parameter  Parameter
Symbol     Description                   Test Conditions (Note 1)                               Min.   Max.   Units

VOH        Output HIGH Voltage           Vcc = Min., VIN = VIL or VIH, IOH = -1.0 mA            2.4           Volts
VOL        Output LOW Voltage            Vcc = Min., VIN = VIL or VIH, IOL = 4.0 mA                    0.5    Volts
VIH        Input HIGH Level              Guaranteed Input Logical HIGH Voltage for All Inputs   2.0           Volts
VIL        Input LOW Level               Guaranteed Input Logical LOW Voltage for All Inputs           0.8    Volts
VI         Input Clamp Voltage           Vcc = Min., IIN = -18 mA                                      -1.5   Volts
IIL        Input LOW Current             Vcc = Max., VIN = 0.5 V     CLK, S16/32, OE                   -1.0   mA
                                                                     Others                            -0.5
IIH        Input HIGH Current            Vcc = Max., VIN = 2.4 V     CLK, S16/32, OE                   100    µA
                                                                     Others                            50
II         Input HIGH Current            Vcc = Max., VIN = 5.5 V                                       1      mA
IOZH       F0-F31 Off State (High-       Vcc = Max.                  VO = 2.4 V                        50     µA
IOZL       Impedance) Output Current                                 VO = 0.5 V                        -50
ISC        Output Short-Circuit Current  Vcc = Max. + 0.5 V,         F0-F31 Outputs             -15    -50    mA
           (Note 2)                      VO = 0.5 V                  Flag Outputs               -15    -50
ICC        Power Supply Current          Vcc = Max.                  COM'L, Tc = +25°C          1800 (Typical)
           (Notes 3, 4)                                              COM'L Only,                       2114   mA
                                                                     Tc = 0 to +85°C Case Temp.
                                                                     Tc = +85°C Case Temp.             1950

Notes: 1. For conditions shown as Min. or Max., use the appropriate value specified under Operating Ranges for the applicable device type.

2. Not more than one output should be shorted at a time. Duration of the short-circuit test should not exceed one second.

3. Measured with OE LOW, and with all output bits (F0-F31 and flag outputs) LOW.

4. Worst-case ICC applies to cold start at lowest operating temperature.



SWITCHING CHARACTERISTICS over operating ranges unless otherwise specified

COM'L (Note 2), Tc = 0 to +85°C Case Temp.

                                                                                  Am29325     Am29325A
No. Symbol  Parameter Description                         Test Conditions         Min.  Max.  Min.  Max.  Units

1   tASC    Clocked Add, Subtract Time (R PLUS S,                                       93          83    ns
            R MINUS S, 2 MINUS S)
2   tMC     Clocked Multiply Time (R TIMES S)                                           93          83    ns
3   tCC     Clocked Conversion Time (INT-TO-FP,                                         100         90    ns
            FP-TO-INT, IEEE-TO-DEC, DEC-TO-IEEE)
4   tASUC   Unclocked Add, Subtract Time (R, S to F,      FT0 = HIGH, FT1 = HIGH        125         110   ns
            Flags) for R PLUS S, R MINUS S, and
            2 MINUS S Instructions
5   tMUC    Unclocked Multiply Time (R, S to F, Flags)                                  126         110   ns
            for R TIMES S Instruction
6   tCUC    Unclocked Conversion Time (R, S to F,                                       125         110   ns
            Flags) for INT-TO-FP, FP-TO-INT,
            IEEE-TO-DEC, and DEC-TO-IEEE Instructions
7   tPWH    Clock Pulse Width HIGH                                                15          15          ns (Note 3)
8   tPWL    Clock Pulse Width LOW                                                 15          15          ns (Note 3)
9   tPDOF1  Clock to F0-F31 and Flag Outputs              FT0 = LOW, FT1 = HIGH         125         110   ns
10  tPDOF2                                                FT1 = LOW                     34          30    ns
11  tPZL    OE Enable Time: Z to LOW                                                    31          29    ns
12  tPZH    OE Enable Time: Z to HIGH                                                   26          24    ns
13  tPLZ    OE Disable Time: LOW to Z                                                   31          31    ns
14  tPHZ    OE Disable Time: HIGH to Z                                                  26          26    ns
15  tPZL16  Clock ↑ to F0-F15 Enable, 16-Bit I/O Mode,    S16/32 = HIGH,                41          39    ns
            Z to LOW                                      ONEBUS = LOW
16  tPZH16    Z to HIGH                                                                 33          33    ns
17  tPLZ16  Clock ↓ to F0-F15 Disable, 16-Bit I/O Mode,                                 26          26    ns
            LOW to Z
18  tPHZ16    HIGH to Z                                                                 38          38    ns
19  tPZL16  Clock ↓ to F16-F31 Enable, 16-Bit I/O Mode,   S16/32 = HIGH,                30          29    ns
            Z to LOW                                      ONEBUS = LOW
20  tPZH16    Z to HIGH                                                                 26          26    ns
21  tPLZ16  Clock ↑ to F16-F31 Disable, 16-Bit I/O Mode,                                34          34    ns
            LOW to Z
22  tPHZ16    HIGH to Z                                                                 36          36    ns
23  tSCE    Register Clock Enable Setup Time              FT0 = LOW, FT1 = LOW    6           6           ns
24  tHCE    Register Clock Enable Hold Time               FT0 = LOW, FT1 = LOW    1           1           ns
25  tSD1    R0-R31, S0-S31 Setup Time (Note 1)            FT0 = LOW               13          13          ns
26  tHD1    R0-R31, S0-S31 Hold Time (Note 1)                                     6           6           ns
27  tSD2    R0-R31, S0-S31 Setup Time (Note 1)            FT0 = HIGH, FT1 = LOW   104         104         ns
28  tHD2    R0-R31, S0-S31 Hold Time (Note 1)                                     -5          -5          ns
29  tSI02   I0-I2 Instruction Select Setup Time           FT for Destination      100         100         ns
                                                          Register = LOW
30  tHI02   I0-I2 Instruction Select Hold Time                                    -5          -5          ns
31  tPDI02  I0-I2 Instruction Select to F0-F31, Flags     FT1 = HIGH                    129         129   ns
32  tSI3    I3 Port S Input Select Setup Time             FT1 = LOW               93          93          ns
33  tHI3    I3 Port S Input Select Hold Time                                      -5          -5          ns
34  tSI4    I4 Register R Input Select Setup Time         FT0 = LOW               15          16          ns
            (Note 1)
35  tHI4    I4 Register R Input Select Hold Time                                                          ns
            (Note 1)
36  tSRM    Round Mode Select Setup Time                  FT for Destination      45          45          ns
                                                          Register = LOW
37  tHRM    Round Mode Select Hold Time                                                                   ns
38  tPRF    Round Mode Select to F0-F31, Flags            FT1 = HIGH                    76          76    ns

Notes: 1. See timing diagram for desired mode of operation to determine clock edge to which these setup and hold times apply.

2. It is the responsibility of the user to maintain a case temperature of 85°C or less. AMD recommends an air velocity of at least 200 linear feet per
minute over the heat sink.

3. Tester limitations necessitate this spec limit. Typical value shown is actual worst-case value.

SWITCHING TEST CIRCUITS

[Circuit diagrams not reproduced: A. Three-State Outputs (TC001084) and B. Normal Outputs (TC001104). Legible annotations: resistor values of 910 Ω and 6K, load capacitance CL, and the load-resistor expressions R2 = 2.4 V / IOH and R1 = (5.0 - VBE - VOL) / (IOL + VOL/1K).]

Notes: 1. CL = 50 pF includes scope probe, wiring, and stray capacitances without device in test fixture.

2. S1, S2, S3 are closed during function tests and all AC tests except output enable tests.

3. S1 and S3 are closed while S2 is open for tPZH test.
S1 and S2 are closed while S3 is open for tPZL test.

4. CL = 5.0 pF for output disable tests.

SWITCHING TEST WAVEFORMS

[Waveform diagrams not reproduced. Four diagrams appear: Set-Up, Hold, and Release Times (WFR02970); Propagation Delay (WFR02980); Pulse Width (WFR02790); and Enable and Disable Times (WFR02660). Input levels are drawn between 0 V and 3 V with a 1.5 V reference level.]

Set-Up, Hold, and Release Times

Notes: 1. Diagram shown for HIGH data only.
Output transition may be opposite sense.
2. Cross-hatched area is don't care
condition.

Enable and Disable Times

Notes: 1. Diagram shown for Input Control Enable-LOW
and Input Control Disable-HIGH.
2. S1, S2 and S3 of Load Circuit are closed
except where shown.

Notes on Test Methods

The following points give the general philosophy which we
apply to tests which must be properly engineered if they are to
be implemented in an automatic environment. The specifics of
what philosophies applied to which test are shown.

1. Ensure that the part is adequately decoupled at the test
head. Large changes in supply current when the device
switches may cause function failures due to Vcc changes.

2. Do not leave inputs floating during any tests, as they may
oscillate at high frequency.

3. Do not attempt to perform threshold tests at high speed.
Following an input transition, ground current may change by
as much as 400 mA in 5 to 8 ns. Inductance in the ground
cable may allow the ground pin at the device to rise by
hundreds of millivolts momentarily.

4. Use extreme care in defining input levels for AC tests. Many
inputs may be changed at once, so there will be significant
noise at the device pins which may not actually reach VIL or
VIH until the noise has settled. AMD recommends using
VIL ≤ 0 V and VIH ≥ 3 V for AC tests.

5. To simplify failure analysis, programs should be designed to
perform DC, Function, and AC tests as three distinct groups
of tests.

6. Capacitive Loading for AC Testing: Automatic testers and
their associated hardware have stray capacitance which
varies from one type of tester to another, but is generally
around 50 pF. This makes it impossible to make
direct measurements of parameters which call for a smaller
capacitive load than the associated stray capacitance.
Typical examples of this are the so-called "float delays,"
which measure the propagation delays into and out of the
high-impedance state, and are usually specified at a load
capacitance of 5.0 pF. In these cases the test is performed
at the higher load capacitance (typically 50 pF), and
engineering correlations based on data taken with a bench
setup are used to predict the result at the lower
capacitance.

Similarly, a product may be specified at more than one
capacitive load. Since the typical automatic tester is not
capable of switching loads in mid-test, it is impossible to
make measurements at both capacitances even though
they may both be greater than the stray capacitance. In
these cases, a measurement is made at one of the two
capacitances. The result at the other capacitance is
predicted from engineering correlations based on data
taken with a bench setup and the knowledge that certain
DC measurements (e.g., IOH, IOL) have already been taken
and are within specification. In some cases, special DC
tests are performed in order to facilitate this correlation.

7. Threshold Testing: The noise associated with automatic
testing, the long, inductive cables, and the high gain of
bipolar devices when in the vicinity of the actual device
threshold, frequently give rise to oscillations when testing
high-speed circuits. These oscillations are not indicative of a
reject device, but instead, of an overtaxed test system. To
minimize this problem, thresholds are tested at least once
for each input pin. Thereafter, "hard" high and low levels
are used for other tests. Generally this means that function
and AC testing are performed at "hard" input levels rather
than at VIL Max. and VIH Min.

8. AC Testing: Occasionally, parameters are specified which
cannot be measured directly on automatic testers because
of tester limitations. Data input hold times often fall into this
category. In these cases, the parameter in question is
guaranteed by correlating tests with other AC tests which
have been performed. These correlations are arrived at by
the cognizant engineer by using data from precise bench
measurements in conjunction with the knowledge that
certain DC parameters have already been measured and
are within specification.

In some cases, certain AC tests are redundant since they
can be shown to be predicted by other tests which have
already been performed. In these cases, the redundant
tests are not performed.

SWITCHING WAVEFORMS

[Timing diagrams not reproduced. The following diagrams appear, in this order.]

Clocked Operation: FT0 = LOW, FT1 = LOW

Clocked Operation: FT0 = HIGH, FT1 = LOW

Clocked Operation: FT0 = LOW, FT1 = HIGH

Flow-Through Operation (FT0 = HIGH, FT1 = HIGH)

32-Bit, Single-Input Bus Mode

Note 1. I4 has special setup and hold time requirements in this mode. All other control signals have timing
requirements as shown in the diagram "Clocked operation, FT0 = LOW, FT1 = LOW."

16-Bit, Two-Input Bus Mode

OUTPUT ENABLE/DISABLE TIMING

[Input/output current diagram not reproduced. Legible annotations:]

DRIVEN INPUT
CLK, S16/32, OE: R = 8 kΩ
ALL OTHER INPUTS: R = 16 kΩ
CI = 5.0 pF, all inputs
CO = 5.0 pF, all outputs
Note: Actual current flow direction shown.

Am29C325

CMOS 32-Bit Floating-Point Processor

ADVANCE INFORMATION

DISTINCTIVE CHARACTERISTICS

Single VLSI device performs high-speed floating-point
arithmetic
 - Floating-point addition, subtraction, and multiplication
   in a single clock cycle
 - Internal architecture supports sum-of-products,
   Newton-Raphson division

32-bit, three-bus flow-through architecture
 - Programmable I/O allows interface to 32- and 16-bit
   systems

IEEE and DEC formats
 - Performs conversions between formats
 - Performs integer ↔ floating-point conversions

Input and output registers can be made transparent
independently

Pin and functionally compatible with the bipolar
Am29325

The Am29C325 uses less than one-quarter the power of
the Am29325

145 PGA requires no heatsink

GENERAL DESCRIPTION

The Am29C325 is a high-speed floating-point processor
unit. It performs 32-bit single-precision floating-point
addition, subtraction, and multiplication operations in a single
VLSI circuit, using the format specified by the proposed
IEEE floating-point standard, 754. The DEC single-precision
floating-point format is also supported. Operations for
conversion between 32-bit integer format and floating-point
format are available, as are operations for converting
between the IEEE and DEC floating-point formats. Any
operation can be performed in a single clock cycle. Six
flags (invalid operation, inexact result, zero, not-a-number,
overflow, and underflow) monitor the status of operations.

The Am29C325 has a three-bus, 32-bit architecture, with
two input buses and one output bus. This configuration
provides high I/O bandwidth, allows access to all buses,
and affords a high degree of flexibility when connecting this
device in a system. All buses are registered, with each
register having a clock enable. Input and output registers
may be made transparent independently. Two other I/O
configurations, a 32-bit, two-bus architecture and a 16-bit,
three-bus architecture, are user-selectable, easing
interface with a wide variety of systems. Thirty-two-bit internal
feedforward datapaths support accumulation operations,
including sum-of-products and Newton-Raphson division.

Fabricated using Advanced Micro Devices' 1.2 micron
CMOS process, the Am29C325 is powered by a single
5-volt supply. The device is housed in a 145-lead
pin-grid-array package.



Am29C300 FAMILY HIGH-PERFORMANCE SYSTEM BLOCK DIAGRAM 

This document contains information on a product under development at Advanced Micro
Devices, Inc. The information is intended to help you to evaluate this product. AMD
reserves the right to change or discontinue work on this product without notice.

Publication # 07783   Rev. B   Amendment /0
Issue Date: September 1987



RELATED AMD PRODUCTS

Part No.     Description

Am29114      Vectored Priority Interrupt Controller
Am29116      High-Performance Bipolar 16-Bit Microprocessor
Am29C116     High-Performance CMOS 16-Bit Microprocessor
Am29PL141    Fuse Programmable Controller
Am29C323     CMOS 32-Bit Parallel Multiplier
Am29331      16-Bit Microprogram Sequencer
Am29C331     CMOS 16-Bit Microprogram Sequencer
Am29332      32-Bit Extended Function ALU
Am29C332     CMOS 32-Bit Extended Function ALU
Am29334      64x18 Four-Port, Dual-Access Register File
Am29C334     CMOS 64x18 Four-Port, Dual-Access Register File
Am29337      16-Bit Bounds Checker
Am29338      Byte Queue



BLOCK DIAGRAM

[Block diagram not reproduced. Legible blocks: CLK; 16 select and enable lines; two 2:1 multiplexers feeding register S and register R; floating-point ALU; status flag generator; status flag register; OE.]

CONNECTION DIAGRAM
Bottom View

PGA

[Pin-grid drawing not reproduced; the pin assignments are listed under PIN DESIGNATIONS.]

Key: 16/32 = S16/32
     I/D = IEEE/DEC
     INEX = INEXACT
     INVA = INVALID
     OBUS = ONEBUS
     OVFL = OVERFLOW
     P/AFF = PROJ/AFF
     UNFL = UNDERFLOW

*D4 is an alignment pin (not connected internally).

PIN DESIGNATIONS
(Sorted by Pin No.)

PIN NO.  PIN NAME    PIN NO.  PIN NAME    PIN NO.  PIN NAME    PIN NO.  PIN NAME

A-1      INEXACT     C-7      F25         H-13     GND         N-10     S28
A-2      INVALID     C-8      Vcc         H-14     GND         N-11     S27
A-3      F29         C-9      Vcc         H-15     S5          N-12     Vcc
A-4      F30         C-10     F16         J-1      CLK         N-13     Vcc
A-5      F23         C-11     F11         J-2      RND0        N-14     S18
A-6      F26         C-12     F10         J-3      Vcc         N-15     S17
A-7      F21         C-13     GND         J-13     GND         P-1      R21
A-8      F22         C-14     F2          J-14     S4          P-2      R22
A-9      F17         C-15     F1          J-15     S7          P-3      R19
A-10     F18         D-1      ENF         K-1      R31         P-4      R16
A-11     F13         D-2      IEEE/DEC    K-2      RND1        P-5      R11
A-12     F12         D-3      ENR         K-3      R29         P-6      R10
A-13     F7          D-13     GND         K-13     S8          P-7      R5
A-14     F8          D-14     GND         K-14     S9          P-8      R4
A-15     F5          D-15     GND         K-15     S6          P-9      I3
B-1      I2          E-1      I4          L-1      R30         P-10     S31
B-2      NAN         E-2      FT0         L-2      R27         P-11     S26
B-3      ZERO        E-3      ENS         L-3      R26         P-12     S25
B-4      F31         E-13     GND         L-13     S13         P-13     S22
B-5      OVERFLOW    E-14     F0          L-14     S10         P-14     S21
B-6      F27         E-15     PROJ/AFF    L-15     S11         P-15     S16
B-7      F24         F-1      ONEBUS      M-1      R25         R-1      R20
B-8      F19         F-2      FT1         M-2      R28         R-2      R17
B-9      F20         F-3      S16/32      M-3      GND         R-3      R18
B-10     F15         F-13     GND         M-13     S14         R-4      R13
B-11     F14         F-14     S1          M-14     S15         R-5      R12
B-12     F9          F-15     S0          M-15     S12         R-6      R7
B-13     F6          G-1      OE          N-1      R24         R-7      R6
B-14     F3          G-2      Vcc         N-2      R23         R-8      R1
B-15     F4          G-3      Vcc         N-3      GND         R-9      R2
C-1      I1          G-13     GND         N-4      R15         R-10     S30
C-2      I0          G-14     S2          N-5      R14         R-11     S29
C-3      GND         G-15     S3          N-6      R9          R-12     S24
C-4      GND         H-1      Vcc         N-7      R8          R-13     S23
C-5      UNDERFLOW   H-2      Vcc         N-8      R3          R-14     S20
C-6      F28         H-3      Vcc         N-9      R0          R-15     S19

PIN DESIGNATIONS (Cont'd.)
(Sorted by Pin Name)

PIN NO.  PIN NAME    PIN NO.  PIN NAME    PIN NO.  PIN NAME    PIN NO.  PIN NAME

J-1      CLK         E-2      FT0         R-6      R7          K-14     S9
D-1      ENF         F-2      FT1         N-7      R8          L-14     S10
D-3      ENR         N-3      GND         N-6      R9          L-15     S11
E-3      ENS         H-14     GND         P-6      R10         M-15     S12
E-14     F0          G-13     GND         P-5      R11         L-13     S13
C-15     F1          M-3      GND         R-5      R12         M-13     S14
C-14     F2          H-13     GND         R-4      R13         M-14     S15
B-14     F3          J-13     GND         N-5      R14         P-15     S16
B-15     F4          D-15     GND         N-4      R15         F-3      S16/32
A-15     F5          D-14     GND         P-4      R16         N-15     S17
B-13     F6          E-13     GND         R-2      R17         N-14     S18
A-13     F7          F-13     GND         R-3      R18         R-15     S19
A-14     F8          C-4      GND         P-3      R19         R-14     S20
B-12     F9          C-3      GND         R-1      R20         P-14     S21
C-12     F10         D-13     GND         P-1      R21         P-13     S22
C-11     F11         C-13     GND         P-2      R22         R-13     S23
A-12     F12         C-2      I0          N-2      R23         R-12     S24
A-11     F13         C-1      I1          N-1      R24         P-12     S25
B-11     F14         B-1      I2          M-1      R25         P-11     S26
B-10     F15         P-9      I3          L-3      R26         N-11     S27
C-10     F16         E-1      I4          L-2      R27         N-10     S28
A-9      F17         D-2      IEEE/DEC    M-2      R28         R-11     S29
A-10     F18         A-1      INEXACT     K-3      R29         R-10     S30
B-8      F19         A-2      INVALID     L-1      R30         P-10     S31
B-9      F20         B-2      NAN         K-1      R31         C-5      UNDERFLOW
A-7      F21         G-1      OE          J-2      RND0        J-3      Vcc
A-8      F22         F-1      ONEBUS      K-2      RND1        G-2      Vcc
A-5      F23         B-5      OVERFLOW    F-15     S0          G-3      Vcc
B-7      F24         E-15     PROJ/AFF    F-14     S1          H-2      Vcc
C-7      F25         N-9      R0          G-14     S2          N-13     Vcc
A-6      F26         R-8      R1          G-15     S3          N-12     Vcc
B-6      F27         R-9      R2          J-14     S4          H-3      Vcc
C-6      F28         N-8      R3          H-15     S5          H-1      Vcc
A-3      F29         P-8      R4          K-15     S6          C-8      Vcc
A-4      F30         P-7      R5          J-15     S7          C-9      Vcc
B-4      F31         R-7      R6          K-13     S8          B-3      ZERO

LOGIC SYMBOL

[Logic symbol drawing not reproduced. Legible signal names:]

Inputs:  CLK, ENR, ENS, ENF, FT0, FT1, I0-I4, IEEE/DEC, OE,
         ONEBUS, PROJ/AFF, RND0, RND1, S16/32

Outputs: INEXACT, INVALID, NAN, OVERFLOW, UNDERFLOW, ZERO

ORDERING INFORMATION

Standard Products

AMD standard products are available in several packages and operating ranges. The order number (Valid Combination) is
formed by a combination of:

a. Device Number
b. Speed Option (if applicable)
c. Package Type
d. Temperature Range
e. Optional Processing

a. DEVICE NUMBER/DESCRIPTION
   Am29C325
   CMOS 32-Bit Floating-Point Processor

b. SPEED OPTION
   -1 = Speed Select

c. PACKAGE TYPE
   G = 145-Lead Pin Grid Array without Heatsink
       (CGX145)

d. TEMPERATURE RANGE
   C = Commercial (0 to +85°C) Case

e. OPTIONAL PROCESSING
   Blank = Standard processing
   B = Burn-in

Valid Combinations

   AM29C325      GC, GCB
   AM29C325-1

Valid Combinations

Valid Combinations list configurations planned to be
supported in volume for this device. Consult the local AMD
sales office to confirm availability of specific valid
combinations, to check on newly released combinations, and
to obtain additional data on AMD's standard military grade
products.

MILITARY ORDERING INFORMATION

APL Products

AMD products for Aerospace and Defense applications are available in several packages and operating ranges. APL (Approved
Products List) products are fully compliant with MIL-STD-883C requirements. The order number (Valid Combination) for APL
products is formed by a combination of:

a. Device Number
b. Speed Option (if applicable)
c. Device Class
d. Package Type
e. Lead Finish

a. DEVICE NUMBER/DESCRIPTION
   Am29C325
   CMOS 32-Bit Floating-Point Processor

b. SPEED OPTION
   Not Applicable

c. DEVICE CLASS
   /B = Class B

d. PACKAGE TYPE
   Z = 145-Lead Pin Grid Array without Heatsink
       (CGX145)

e. LEAD FINISH
   C = Gold

Valid Combinations

   AM29C325      /BZC

Valid Combinations

Valid Combinations list configurations planned to be
supported in volume for this device. Consult the local AMD
sales office to confirm availability of specific valid
combinations or to check for newly released valid
combinations.

Group A Tests

Group A tests consist of Subgroups
1, 2, 3, 7, 8, 9, 10, 11.
PIN DESCRIPTION

CLK   Clock (Input)

For the internal registers.

ENF   Register F Clock Enable (Input; Active LOW)

When ENF is LOW, register F is clocked on the LOW-to-HIGH
transition of CLK. When ENF is HIGH, register F
retains the previous contents.

ENR   Register R Clock Enable (Input; Active LOW)

When ENR is LOW, register R is clocked on the LOW-to-HIGH
transition of CLK. When ENR is HIGH, register R
retains the previous contents.

ENS   Register S Clock Enable (Input; Active LOW)

When ENS is LOW, register S is clocked on the LOW-to-HIGH
transition of CLK. When ENS is HIGH, register S
retains the previous contents.

F0-F31   F Operand Bus (Output)

F0 is the least-significant bit.

FT0   Input Register Feedthrough Control (Input;
Active HIGH)

When FT0 is HIGH, registers R and S are transparent.

FT1   Output Register Feedthrough Control (Input;
Active HIGH)

When FT1 is HIGH, register F and the status flag register
are transparent.

I0-I2   Operation Select Lines (Input)

Used to select the operation to be performed by the ALU.
See Table 1 for a list of operations and the corresponding
codes.

I3   ALU S Port Input Select (Input)

A LOW on I3 selects register S as the input to the ALU S
port. A HIGH on I3 selects register F as the input to the ALU
S port.

I4   Register R Input Select (Input)

A LOW on I4 selects R0-R31 as the input to register R. A
HIGH selects the ALU F port as the input to register R.

IEEE/DEC   IEEE/DEC Mode Select (Input)

When IEEE/DEC is HIGH, IEEE mode is selected. When
IEEE/DEC is LOW, DEC mode is selected.

INEXACT   Inexact Result Flag (Output; Active HIGH)

A HIGH indicates that the final result of the last operation
was not infinitely precise, due to rounding.

INVALID   Invalid Operation Flag (Output; Active
HIGH)

A HIGH indicates that the last operation performed was
invalid; e.g., ∞ times 0.

NAN   Not-a-Number Flag (Output; Active HIGH)

A HIGH indicates that the final result produced by the last
operation is not to be interpreted as a number. The output in
such cases is either an IEEE Not-a-Number (NAN) or a
DEC-reserved operand.

OE   Output Enable (Input; Active LOW)

When OE is LOW, the contents of register F are placed on
F0-F31. When OE is HIGH, F0-F31 assume a high-impedance
state.

ONEBUS   Input Bus Configuration Control (Input)

A LOW on ONEBUS configures the input bus circuitry for
two-input bus operation. A HIGH on ONEBUS configures
the input bus circuitry for single-input bus operation.

OVERFLOW   Overflow Flag (Output; Active HIGH)

A HIGH indicates that the last operation produced a final
result that overflowed the floating-point format.

PROJ/AFF   Projective/Affine Mode Select (Input)

Choice of projective or affine mode determines the way in
which infinities are handled in IEEE mode. A LOW on
PROJ/AFF selects affine mode; a HIGH selects projective
mode.

R0-R31   R Operand Bus (Input)

R0 is the least-significant bit.

RND0, RND1   Rounding Mode Selects (Input)

RND0 and RND1 select one of four rounding modes. See
Table 5 for a list of rounding modes and the corresponding
control codes.

S0-S31   S Operand Bus (Input)

S0 is the least-significant bit.

S16/32   16- or 32-Bit I/O Mode Select (Input)

A LOW on S16/32 selects the 32-bit I/O mode; a HIGH
selects the 16-bit I/O mode. In 32-bit mode, input and
output buses are 32 bits wide. In 16-bit mode, input and
output buses are 16 bits wide, with the least- and most-significant
portions of the 32-bit input and output words
being placed on the buses during the HIGH and LOW
portions of CLK, respectively.

UNDERFLOW   Underflow Flag (Output; Active HIGH)

A HIGH indicates that the last operation produced a
rounded result that underflowed the floating-point format.

ZERO   Zero Flag (Output; Active HIGH)

A HIGH indicates that the last operation produced a final
result of zero.



Definition of Terms 

Affine Mode 

One of two modes affecting the handling of operations on 
infinities — see the Operations with Infinities section under 
Operations in IEEE Mode. 

Biased Exponent 

The true exponent of a floating-point number, plus a constant. 
For IEEE floating-point numbers, the constant is 127; for DEC 
floating-point numbers, the constant is 128. See also True 
Exponent. 

Bus 

Data input or output channel for the floating-point processor. 



DEC-Reserved Operand 

A DEC floating-point number that is interpreted as a symbol 
and has no numeric value. A DEC-reserved operand has a 
sign of 1 and a biased exponent of 0. 

Destination Format 

The format of the final result produced by the floating-point 
ALU. The destination format can be IEEE floating point, DEC 
floating point, or integer. 

Final Result 

The result produced by the floating-point ALU. 

Fraction 

The 23 least-significant bits of the mantissa. 
Infinitely Precise Result

The result that would be obtained from an operation if both
exponent range and precision were unbounded.

Input Operands 

The value or values on which an operation is performed. For 
example, the addition 2 + 3 = 5 has input operands 2 and 3. 

Mantissa 

The portion of a floating-point number containing the number's
significant bits. For the floating-point number 1.101 x 2^-3, the
mantissa is 1.101.

NAN (Not-a-Number) 

An IEEE floating-point number that is interpreted as a symbol,
and has no numeric value. A NAN has a biased exponent of
255 (decimal) and a non-zero fraction.

Port 

Data input or output channel for the floating-point ALU. 

Projective Mode 

One of two modes affecting the handling of operations on 
infinities — see the Operations with Infinities section under 
Operation in IEEE Mode. 

Rounded Result

The result produced by rounding the infinitely precise result to
fit the destination format.

True Exponent (or Exponent)

Number representing the power of two by which a floating-point
number's mantissa is to be multiplied. For the floating-point
number 1.101 x 2^-3, the true exponent is -3.

FUNCTIONAL DESCRIPTION
Architecture

The Am29C325 comprises a high-speed floating-point ALU, a
status flag generator, and a 32-bit data path.

Floating-Point ALU

The floating-point ALU performs 32-bit floating-point operations.
It also performs floating-point-to-integer conversions,
integer-to-floating-point conversions, and conversions
between the IEEE and DEC formats. The ALU has
two 32-bit input ports, R and S, and a 32-bit output port, F.

Conceptually, the process performed by the ALU can be
divided into three stages (see Figure 1). The operation stage
performs the arithmetic operation selected by the user; the
output of this stage is referred to as the infinitely precise
result of the operation. The rounding stage rounds the
infinitely precise result to fit in the destination format; the
output of this stage is called the rounded result. The last stage
checks for exceptional conditions. If no exceptional condition
is found, the rounded result is passed through this stage. If
some exceptional condition is found (e.g., overflow, underflow,
or an invalid operation), this stage may replace the rounded
result with another output such as +∞, -∞, a NAN, or a
DEC-reserved operand. The output of this last stage appears on
port F, and is called the final result.

[Figure 1: operands R and S enter the OPERATION STAGE (performs
selected operation), which produces the infinitely precise result;
the ROUNDING STAGE (rounds infinitely precise result) produces the
rounded result; the EXCEPTION STAGE (checks for unusual conditions)
produces the final result.]

Figure 1. Conceptual Model of the Process
Performed by the Floating-Point ALU

The ALU performs one of eight operations; the operation to be
performed is selected by placing the appropriate control code
on lines I0-I2. Table 1 gives the control codes corresponding
to each of the eight operations.

The floating-point addition operation (R PLUS S) adds the
floating-point numbers on ports R and S, and places the
floating-point result on port F. In IEEE mode (IEEE/DEC = HIGH)
the addition is performed in IEEE floating-point
format; in DEC mode (IEEE/DEC = LOW) the addition is
performed in DEC format.

The floating-point subtraction operation (R MINUS S) subtracts
the floating-point number on port S from the floating-point
number on port R and places the floating-point result on
port F. In IEEE mode (IEEE/DEC = HIGH) the subtraction is
performed in IEEE floating-point format; in DEC mode
(IEEE/DEC = LOW) the subtraction is performed in DEC
format.

The floating-point multiplication operation (R TIMES S) multiplies
the floating-point numbers on ports R and S, and places
the floating-point result on port F. In IEEE mode (IEEE/DEC = HIGH)
the multiplication is performed in IEEE floating-point
format; in DEC mode (IEEE/DEC = LOW) the multiplication
is performed in DEC format.

The floating-point constant subtraction (2 MINUS S) operation
subtracts the floating-point value on port S from 2, and places
the result on port F. The operand on port R is not used in this
operation; its value will not affect the operation in any way. In
IEEE mode (IEEE/DEC = HIGH) the operation is performed in
IEEE floating-point format; in DEC mode (IEEE/DEC = LOW)
the operation is performed in DEC format. This operation is
used to support Newton-Raphson floating-point division; a
description of its use appears in Appendix C.
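
The role of the 2 MINUS S operation in division can be illustrated
with a short sketch. Appendix C is not reproduced here, so the code
below assumes the textbook Newton-Raphson reciprocal iteration
x <- x * (2 - b * x); the function names, the seed value, and the
iteration count are hypothetical, and Python double-precision floats
stand in for the 32-bit hardware format.

```python
def reciprocal(b, x0, iterations=4):
    """Approximate 1/b with the iteration x <- x * (2 - b*x).

    x0 is a caller-supplied seed (an assumption of this sketch); the
    iteration converges when |1 - b*x0| < 1.
    """
    x = x0
    for _ in range(iterations):
        t = b * x          # an R TIMES S step
        t = 2.0 - t        # a 2 MINUS S step
        x = x * t          # another R TIMES S step
    return x


def divide(a, b, x0, iterations=4):
    """Compute a / b as a * (1/b), using the reciprocal sketch above."""
    return a * reciprocal(b, x0, iterations)
```

Each pass uses two multiplications and one constant subtraction, which
is why a dedicated 2 MINUS S operation is convenient for this method.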

The integer-to-floating-point conversion (INT-TO-FP) operation
takes a 32-bit, two's-complement integer on port R and
places the equivalent floating-point value on port F. The
operand on port S is not used in this operation; its value will
not affect the operation in any way. In IEEE mode (IEEE/DEC = HIGH)
the result is delivered in IEEE format; in DEC
mode (IEEE/DEC = LOW) the result is delivered in DEC
format.



TABLE 1. ALU OPERATION SELECT

I2  I1  I0  Operation                                         Output Equation
0   0   0   Floating-point addition (R PLUS S)                F = R + S
0   0   1   Floating-point subtraction (R MINUS S)            F = R - S
0   1   0   Floating-point multiplication (R TIMES S)         F = R * S
0   1   1   Floating-point constant subtraction (2 MINUS S)   F = 2 - S
1   0   0   Integer-to-floating-point conversion (INT-TO-FP)  F (floating-point) = R (integer)
1   0   1   Floating-point-to-integer conversion (FP-TO-INT)  F (integer) = R (floating-point)
1   1   0   IEEE-TO-DEC format conversion (IEEE-TO-DEC)       F (DEC format) = R (IEEE format)
1   1   1   DEC-TO-IEEE format conversion (DEC-TO-IEEE)       F (IEEE format) = R (DEC format)
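
The operation select codes of Table 1 can be captured as a lookup
table. This is a minimal sketch for illustration; the names
ALU_OPERATIONS and decode_operation are hypothetical, and the codes
assume the binary counting order of I2, I1, I0 implied by the table.

```python
# Operation select codes from Table 1, keyed by the levels on (I2, I1, I0),
# with 0 = LOW and 1 = HIGH.
ALU_OPERATIONS = {
    (0, 0, 0): "R PLUS S",
    (0, 0, 1): "R MINUS S",
    (0, 1, 0): "R TIMES S",
    (0, 1, 1): "2 MINUS S",
    (1, 0, 0): "INT-TO-FP",
    (1, 0, 1): "FP-TO-INT",
    (1, 1, 0): "IEEE-TO-DEC",
    (1, 1, 1): "DEC-TO-IEEE",
}


def decode_operation(i2, i1, i0):
    """Return the Table 1 operation name for the given select-line levels."""
    return ALU_OPERATIONS[(i2, i1, i0)]
```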



The floating-point-to-integer conversion (FP-TO-INT) operation
takes a floating-point number on port R and places the
equivalent 32-bit, two's-complement integer value on port F.
The operand on port S is not used in this operation; its value
will not affect the operation in any way. In IEEE mode (IEEE/DEC = HIGH)
the operand on port R is interpreted using the
IEEE floating-point format; in DEC mode (IEEE/DEC = LOW)
it is interpreted using the DEC floating-point format.

The IEEE-to-DEC conversion operation (IEEE-TO-DEC) takes
an IEEE-format floating-point number on port R and places the
equivalent DEC-format floating-point number on port F. The
operand on port S is not used in this operation; its value will
not affect the operation in any way. The operation can be
performed in either IEEE mode (IEEE/DEC = HIGH) or DEC
mode (IEEE/DEC = LOW).

The DEC-to-IEEE conversion operation (DEC-TO-IEEE) takes
a DEC-format floating-point number on port R and places the
equivalent IEEE-format floating-point number on port F. The operand
on port S is not used in this operation; its value will not affect
the operation in any way. The operation can be performed in
either IEEE mode (IEEE/DEC = HIGH) or DEC mode
(IEEE/DEC = LOW).

Status Flag Generator 

The status flag generator controls the state of six flags that
report the status of floating-point ALU operations. The flags
indicate when an operation is invalid (e.g., ∞ times 0) or when
an operation has produced an overflow, an underflow, a
non-numerical result (e.g., a NAN or a DEC-reserved operand), an
inexact result, or a result of zero. The flags represent the
status of the most recently performed operation. Flag status is
stored in the flag status register on the LOW-to-HIGH transition
of CLK. When the output register feedthrough control FT1
is HIGH, the flag status register is made transparent.



Data Path 

The 32-bit data path consists of the R and S input buses; the F 
output bus; data registers R, S, and F; the register R input 
multiplexer; and the ALU port S input multiplexer. 

Input operands enter the floating-point processor through the
32-bit R and S input buses, R0-R31 and S0-S31. Results of
operations appear on the 32-bit F bus, F0-F31. The F bus
assumes a high-impedance state when output enable OE is
HIGH.

The R and S registers store input operands; the F register
stores the final result of the floating-point ALU operation. Each
register has an independent clock enable (ENR, ENS, and
ENF). When a register's clock enable is LOW, the register
stores the data on its input at the LOW-to-HIGH transition of
CLK; when the clock enable is HIGH, the register retains its
current data. All data registers are fully edge-triggered; both
the input data and the register enable need only meet modest
setup and hold time requirements. Registers R and S can be
made transparent by setting FT0, the input register feedthrough
control, HIGH. Register F can be made transparent by
setting FT1, the output register feedthrough control, HIGH.

The register R input multiplexer selects either the R input bus 
or the floating-point ALU's F port as the input to register R. 
Selection is controlled by I4 — a LOW selects the R input bus; 
a HIGH selects the ALU F port. The ALU port S input 
multiplexer selects either register S or register F as the input to 
the floating-point ALU's S port. Selection is controlled by I3 — 
a LOW selects register S; a HIGH selects register F. 

Data selected by I3 and I4 is described in Table 2. When
registers R and S are transparent (FT0 = HIGH), multiplexer
select I4 must be kept LOW, so that the register R input
multiplexer selects R0-R31. When register F is transparent
(FT1 = HIGH), multiplexer select I3 must be kept LOW, so that
the ALU port S input multiplexer selects register S.
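
The multiplexer selection and the feedthrough restrictions described
above can be modeled in a few lines. This is a behavioral sketch only;
the function names and the use of ValueError to report a forbidden
combination are assumptions made for illustration, not part of the
device specification.

```python
def select_alu_inputs(i3, i4, r_bus, reg_s, reg_f, alu_f):
    """Return (register R input, ALU S-port input) for the given selects.

    I4 LOW selects the R bus, HIGH selects the ALU F port.
    I3 LOW selects register S, HIGH selects register F.
    """
    reg_r_input = alu_f if i4 else r_bus
    alu_s_input = reg_f if i3 else reg_s
    return reg_r_input, alu_s_input


def check_feedthrough(ft0, ft1, i3, i4):
    """Raise ValueError if the selects violate the transparency rules."""
    if ft0 and i4:
        raise ValueError("I4 must be LOW while registers R and S are transparent")
    if ft1 and i3:
        raise ValueError("I3 must be LOW while register F is transparent")
```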
TABLE 2. MUX SELECT

I3  Data selected for floating-point ALU S port
0   Register S
1   Register F

I4  Data selected for register R input
0   R bus
1   Floating-point ALU port F

I/O Modes

The Am29C325 data path can be configured in one of three
I/O modes: a 32-bit, two-input bus mode; a 32-bit, single-input
bus mode; and a 16-bit, two-input bus mode. These modes
affect only the manner in which data is delivered to and taken
from the Am29C325; operation of the floating-point ALU is not
altered. The I/O mode is selected with the ONEBUS and S16/32
controls. Table 3 lists the control codes needed to invoke
each I/O mode.

TABLE 3. I/O MODE SELECTION

S16/32  ONEBUS  I/O Mode
0       0       32-bit, two-input-bus mode
0       1       32-bit, single-input-bus mode(*)
1       0       16-bit, two-input-bus mode(*)
1       1       Illegal I/O mode selection value

*FT0 must be held LOW in this mode (see text).
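
The I/O mode selection of Table 3 can be expressed as a small decoder.
This is a sketch for illustration; the function names are hypothetical,
and the encoding (0 = LOW, 1 = HIGH) follows the ONEBUS and S16/32 pin
descriptions.

```python
IO_MODES = {
    (0, 0): "32-bit, two-input-bus mode",
    (0, 1): "32-bit, single-input-bus mode",
    (1, 0): "16-bit, two-input-bus mode",
}


def io_mode(s16_32, onebus):
    """Return the I/O mode selected by the S16/32 and ONEBUS levels."""
    try:
        return IO_MODES[(s16_32, onebus)]
    except KeyError:
        raise ValueError("illegal I/O mode selection value") from None


def ft0_must_be_low(s16_32, onebus):
    """True for the two modes in which FT0 must be held LOW."""
    return (s16_32, onebus) in {(0, 1), (1, 0)}
```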

32-Bit, Two-Input Bus Mode

In this I/O mode, the R and S buses are configured as
independent 32-bit input buses, and the F bus is configured as
a 32-bit output bus. Figure 2 is a functional block diagram of
the Am29C325 in this I/O mode.

R and S operands are taken from their respective input buses
and clocked into the R and S registers on the LOW-to-HIGH
transition of CLK. Register F is also clocked on the LOW-to-HIGH
transition of CLK. Figure 5(a) depicts typical I/O timing
in this mode.



[Figure 2: block diagram showing the R and S input buses (R0-R31,
S0-S31), registers R and S, the floating-point ALU with ports R, S,
and F, and register F driving the F bus (F0-F31), together with the
CLK, ENR, ENS, ENF, OE, and I3 controls; ONEBUS = LOW, S16/32 = LOW.]

Figure 2. Functional Block Diagram for the 32-Bit, Two-Input Bus Mode
32-Bit, Single-Input Bus Mode

In this I/O mode, the R and S buses are connected to a single
32-bit multiplexed input data bus; the F bus is configured as an
independent 32-bit output bus. Figure 3 is a functional block
diagram of the Am29C325 in this I/O mode. Note that both the
R and S bus lines must be wired to the input bus.

R and S operands are multiplexed onto the input bus by the
host system. The S operand is clocked from the input bus into
a temporary holding register on the HIGH-to-LOW transition of
CLK and is transferred to register S on the LOW-to-HIGH
transition of CLK. The R operand is clocked from the input bus
into register R on the LOW-to-HIGH transition of CLK. Register
F is clocked on the LOW-to-HIGH transition of CLK. Figure
5(b) depicts typical I/O timing in this mode.

When placed in this I/O mode, the data path will not function
properly if the R and S registers are made transparent.
Therefore, input register feedthrough control FT0 must be held
LOW in this mode.



[Figure 3: block diagram showing the multiplexed R/S input bus,
the input multiplexers, the floating-point ALU, and the F bus
(F0-F31), together with the CLK, ENR, ENS, ENF, OE, and I3 controls;
S16/32 = LOW.]

Figure 3. Functional Block Diagram for the 32-Bit, Single-Input Bus Mode
16-Bit, Two-Input Bus Mode

In this I/O mode, the R and S buses are configured as
independent 16-bit input buses, and the F bus is configured as
a 16-bit output bus. Figure 4 is a functional block diagram of
the Am29C325 in this I/O mode. Note that the 16 least-significant
bits (LSBs) and 16 most-significant bits (MSBs) of
the R, S, and F buses must be wired to their respective system
buses in parallel.

Thirty-two-bit operands are passed along the 16-bit data
buses by time-multiplexing the 16 LSBs and 16 MSBs of each
32-bit word. For the R input bus, the host system multiplexes
the 16 LSBs and 16 MSBs of the R operand onto the 16-bit R
bus. The 16 LSBs of the R operand are stored in a temporary
holding register on the HIGH-to-LOW transition of CLK. The 16
MSBs are clocked into register R on the LOW-to-HIGH
transition of CLK; at the same time, the 16 LSBs are
transferred from the temporary holding register to register R.
Transfer of data from the S input bus to the S register takes
place in a similar fashion. Register F is clocked on the
LOW-to-HIGH transition of CLK. Circuitry internal to the Am29C325
multiplexes data from register F onto the 16-bit output bus by
enabling the 16 LSBs of the F output bus when CLK is HIGH,
and enabling the 16 MSBs of the F output bus when CLK is
LOW. Figure 5(c) depicts typical I/O timing in this mode.
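
The time-multiplexing described above amounts to splitting each 32-bit
word into two 16-bit halves and reassembling them at the other end.
The sketch below shows only that data manipulation, not the clock
timing; the function names are hypothetical.

```python
def split_word(word):
    """Split a 32-bit word into (16 LSBs, 16 MSBs).

    The LSBs are listed first because they are the half captured on the
    HIGH-to-LOW transition of CLK, before the MSBs are clocked in.
    """
    lsbs = word & 0xFFFF
    msbs = (word >> 16) & 0xFFFF
    return lsbs, msbs


def join_halves(lsbs, msbs):
    """Reassemble a 32-bit word from its 16-bit halves."""
    return ((msbs & 0xFFFF) << 16) | (lsbs & 0xFFFF)
```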

When placed in this I/O mode, the data path will not function
properly if the R and S registers are made transparent.
Therefore, input register feedthrough control FT0 must be held
LOW in this mode. Caution must also be taken in controlling
the register R input multiplexer control line, I4, in this I/O
mode. I4 should be changed only when CLK is HIGH, in
addition to meeting the setup and hold time requirements
given in the Switching Characteristics section.

Operation in IEEE Mode 

When input signal IEEE/DEC is HIGH, the IEEE mode of
operation is selected. In this mode the Am29C325 uses the
floating-point format set forth in the IEEE Proposed Standard
for Binary Floating-Point Arithmetic, P754. In addition, the
IEEE mode complies with most other aspects of single-precision
floating-point operation outlined in the proposed
standard; differences are discussed in Appendix A.

IEEE Floating-Point Format

The IEEE single-precision floating-point word is 32 bits wide,
and is arranged in the format shown in Figure 6. The floating-point
word is divided into three fields: a single-bit sign, an 8-bit
biased exponent, and a 23-bit fraction.

The sign bit indicates the sign of the floating-point number's
value. Non-negative values have a sign of 0; negative values,
a sign of 1. The value zero may have either sign.

The biased exponent is an 8-bit unsigned integer field representing
a multiplicative factor of some power of two. The bias
value is 127. If, for example, the multiplicative factor for a
floating-point number is to be 2^a, the value of the biased
exponent would be a + 127; "a" is called the true exponent.

The fraction is a 23-bit unsigned fraction field containing the
23 LSBs of the floating-point number's 24-bit mantissa. The
weight of the fraction's MSB is 2^-1; the weight of the LSB is 2^-23.
Figure 4. Functional Block Diagram for the 16-Bit, Two-Input Bus Mode
A floating-point number is evaluated or interpreted per the
following conventions:

let s = sign bit
    e = biased exponent
    f = fraction

if e = 0 and f = 0 ...      value = (-1)^s * (0)  (+0, -0)
if e = 0 and f ≠ 0 ...      value = denormalized number
if 0 < e < 255 ...          value = (-1)^s * 2^(e - 127) * (1.f)
                            (normalized number)
if e = 255 and f = 0 ...    value = (-1)^s * (∞)  (+∞, -∞)
if e = 255 and f ≠ 0 ...    value = not-a-number (NAN)

Zero: The value zero can have either a positive or negative 
sign. Rules for determining the sign of a zero produced by an 
operation are given in the Sign Bit section. 

Denormalized Number: A denormalized number represents a
quantity with magnitude less than 2^-126 but greater than zero.

Normalized Number: A normalized number represents a
quantity with magnitude greater than or equal to 2^-126 but
less than 2^128.
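
The interpretation rules above can be written as a short decoder. This
is a minimal sketch: the function name and the returned tuple layout
are hypothetical, and exact rational arithmetic (Python's Fraction) is
used so that no rounding is introduced. Only normalized numbers and
zeros are given a numeric value here, since the text above assigns
values only to those cases.

```python
from fractions import Fraction


def decode_ieee_single(word):
    """Classify a 32-bit word; return (kind, sign bit, value or None)."""
    s = (word >> 31) & 1          # sign bit
    e = (word >> 23) & 0xFF       # biased exponent
    f = word & 0x7FFFFF           # 23-bit fraction
    if e == 0 and f == 0:
        return ("zero", s, Fraction(0))
    if e == 0:
        return ("denormalized", s, None)
    if e == 255 and f == 0:
        return ("infinity", s, None)
    if e == 255:
        return ("nan", s, None)
    mantissa = 1 + Fraction(f, 2**23)            # 1.f with the implied leading 1
    value = (-1) ** s * Fraction(2) ** (e - 127) * mantissa
    return ("normalized", s, value)
```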

Example 1: 

The number +3.5 can be represented in floating-point
format as follows:

+3.5 = 11.1₂ x 2^0
     = 1.11₂ x 2^1

sign = 0

biased exponent = 1₁₀ + 127₁₀ = 128₁₀
                = 10000000₂

fraction = 11000000000000000000000₂

(the leading 1 is implied in the format)

Concatenating these fields produces the floating-point word
40600000₁₆.
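
The concatenation step in this example can be checked with a few lines
of code. The helper name pack_fields is hypothetical; it simply places
the three fields at bit positions 31, 30-23, and 22-0.

```python
def pack_fields(sign, biased_exponent, fraction):
    """Concatenate sign (1 bit), biased exponent (8 bits), fraction (23 bits)."""
    return (sign << 31) | (biased_exponent << 23) | fraction


# +3.5 = 1.11 (binary) x 2^1: sign 0, biased exponent 1 + 127,
# fraction bits 11 followed by 21 zeros.
word = pack_fields(0, 1 + 127, 0b11 << 21)
print(f"{word:08X}")  # prints 40600000
```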
[Figure 5: bus timing diagrams for (a) the 32-bit, two-input-bus
mode; (b) the 32-bit, single-input-bus mode; and (c) the 16-bit,
two-input-bus mode, in which the F bus alternates between the 16 LSBs
and the 16 MSBs of the F data.]

Figure 5. Typical Bus Timing for the I/O Modes with FT0 = LOW, FT1 = LOW
[Figure 6: bit 31 is the sign bit (S); bits 30-23 are the biased
exponent (E), with bit weights 2^7 down to 2^0; bits 22-0 are the
fraction (F), with bit weights 2^-1 down to 2^-23.]

VALUE = (-1)^S * 2^(E - 127) * (1.F)

Figure 6. IEEE Mode Single-Precision Floating-Point Format
Example 2:

The number -11.375 can be represented in floating-point
format as follows:

-11.375 = -1011.011₂ x 2^0
        = -1.011011₂ x 2^3

sign = 1

biased exponent = 3₁₀ + 127₁₀ = 130₁₀
                = 10000010₂

fraction = 01101100000000000000000₂

(the leading 1 is implied in the format)

Concatenating these fields produces the floating-point word
C1360000₁₆.
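
Both worked examples can be cross-checked against the IEEE 754
single-precision encoding available on a host computer. The sketch
below uses Python's standard struct module; it assumes that the host
encoding matches the format described here for normalized numbers,
which holds for these two values. The function name is hypothetical.

```python
import struct


def float_to_word(x):
    """Return the 32-bit IEEE single-precision encoding of x as an integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


print(f"{float_to_word(3.5):08X}")      # prints 40600000
print(f"{float_to_word(-11.375):08X}")  # prints C1360000
```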



Infinity: Infinity can have either a positive or negative sign.
The way in which infinities are interpreted is determined by the
state of the projective/affine mode select, PROJ/AFF.

Not-a-Number: A not-a-number, or NAN, does not represent
a numeric value, but is interpreted as a signal or symbol. NANs
are used to indicate invalid operations, and as a means of
passing process status information through a series of calculations.
NANs arise in two ways: 1) they can be generated by the
Am29C325 to indicate that an invalid operation has taken
place (e.g., ∞ x 0), or 2) be provided by the user as an input
operand. There are two types of NANs, signalling and quiet
(see Figure 7 for formats).

IEEE Mode Integer Format

Integer numbers are represented as 32-bit, two's-complement
words (Figure 8 depicts the integer format). The integer word
can represent a range of integer values from -2^31 to 2^31 - 1.
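
The 32-bit two's-complement integer format can be illustrated as
follows. The function names are hypothetical, and the range check
reflects the limits stated above.

```python
def int_to_word(n):
    """Encode an integer as a 32-bit two's-complement word."""
    if not -2**31 <= n <= 2**31 - 1:
        raise ValueError("value does not fit in the 32-bit integer format")
    return n & 0xFFFFFFFF


def word_to_int(word):
    """Decode a 32-bit two's-complement word into a Python integer."""
    word &= 0xFFFFFFFF
    return word - 2**32 if word & 0x80000000 else word
```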
[Figure 7: bit layouts of the signalling and quiet NAN formats.
X = don't care. At least one of the twenty-two LSBs of a quiet NAN
must be 1.]

Figure 7. Signalling and Quiet NAN Formats

Figure 8. 32-Bit Integer Format



Operations 



All eight floating-point ALU operations discussed in the
Functional Description section can be performed in IEEE
mode. Various exceptional aspects of the R PLUS S, R MINUS
S, R TIMES S, 2 MINUS S, INT-TO-FP, and FP-TO-INT
operations for this mode are described below. The IEEE-TO-DEC
and DEC-TO-IEEE operations are discussed separately
in the IEEE-TO-DEC and DEC-TO-IEEE Operations section.

Operations with NANs: NANs arise in two ways: 1) they can
be generated by the Am29C325 to indicate that an invalid
operation has taken place (e.g., ∞ x 0), or 2) be provided by
the user as an input operand. There are two types of NANs,
signalling and quiet (see Figure 7 for formats).

Signalling NANs set the invalid operation flag when they
appear as an input operand to an operation. They are useful
for indicating uninitialized variables, or for implementing
user-designed extensions to the operations provided. The ALU
never produces a signalling NAN as the final result of an
operation.

Quiet NANs are generated for invalid operations. When they
appear as an input operand, they are passed through most
operations without setting the invalid flag, the floating-point-to-integer
conversion operation being the exception.

The sign of any input operand NAN is ignored. All quiet NANs
produced as the final result of an operation have a sign of 0.

When a NAN appears as an input operand, the final result of
the operation is a quiet NAN that is created by taking the input
NAN and forcing bit 22 LOW and bit 21 HIGH. If an operation
has two NANs as input operands, the resulting quiet NAN is
created using the NAN on the R port.

When a quiet NAN is produced as the final result of an invalid
operation whose input operand or operands are not NANs, the
resulting NAN will always have the value 7FA00000₁₆.

The NAN flag will be HIGH whenever an operation produces a
NAN as a final result.

Example 1:

Suppose the floating-point addition operation is performed
with the following input operands:

R port: 3F800000₁₆ (1.0 x 2^0)
S port: 7FC12345₁₆ (signalling NAN)

Result: The signalling NAN on the S port is converted to a
        quiet NAN by forcing bit 22 LOW and bit 21 HIGH.
        The operation's final result will be 7FA12345₁₆.
        Since one of the two input operands is a signalling
        NAN, the invalid flag will be HIGH; the NAN flag will
        also be HIGH.

Example 2:

Suppose the floating-point multiplication operation is performed
with the following input operands:

R port: FFF11111₁₆ (signalling NAN)
S port: 7FC22222₁₆ (quiet NAN)

Result: Since both input operands are NANs, the NAN on
        the R port is chosen for output. In addition to forcing
        bit 22 LOW, the sign bit (bit 31) is set LOW (bit 21 is
        already HIGH, and need not be changed). The
        operation's final result will be 7FB11111₁₆. Since
        one of the two input operands is a signalling NAN,
        the invalid flag is HIGH; the NAN flag will also be
        HIGH.

Example 3:

Suppose the floating-point subtraction operation is performed
with the following input operands:

R port: FF800001₁₆ (quiet NAN)
S port: 7F800000₁₆ (+∞)

Result: To create the final result, the quiet NAN's sign bit (bit
        31) is forced LOW and bit 21 is forced HIGH (bit 22
        is already LOW, and need not be changed). The final
        result will be 7FA00001₁₆. The NAN flag will be
        HIGH.
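
The bit manipulation used in the three examples above can be sketched
in code. The function names are hypothetical; the sketch covers only
the construction of the output NAN (sign forced to 0, bit 22 forced
LOW, bit 21 forced HIGH, R port taking priority) and does not model
the invalid flag.

```python
def is_nan(word):
    """True if the word has a biased exponent of 255 and a non-zero fraction."""
    return (word >> 23) & 0xFF == 255 and (word & 0x7FFFFF) != 0


def quiet_nan_from_input(nan_word):
    """Build the output quiet NAN from an input NAN."""
    word = nan_word & 0x7FFFFFFF   # sign bit (bit 31) forced LOW
    word &= ~(1 << 22)             # bit 22 forced LOW
    word |= 1 << 21                # bit 21 forced HIGH
    return word


def nan_result(r_word, s_word):
    """Return the output NAN for NAN input operands, or None if neither is a NAN."""
    if is_nan(r_word):
        return quiet_nan_from_input(r_word)   # the R port takes priority
    if is_nan(s_word):
        return quiet_nan_from_input(s_word)
    return None
```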

Operations with Denormalized Numbers: The proposed
IEEE standard incorporates denormalized numbers to allow a
means of gradual underflow for operations that produce
non-zero results too small to be expressed as a normalized
floating-point number. The Am29C325 does not support
gradual underflow. If a floating-point operation produces a
non-zero rounded result that is not large enough to be
expressed as a normalized floating-point number, the final
result will be a zero of the same sign; the inexact, underflow,
and zero flags will be HIGH. If an input operand is a
denormalized number, the floating-point ALU will assume that
operand to be a zero of the same sign.
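
The treatment of denormalized input operands can be modeled as a
simple filter applied to each operand word. This is an illustrative
sketch with a hypothetical function name.

```python
def flush_denormal_input(word):
    """Replace a denormalized operand with a zero of the same sign."""
    biased_exponent = (word >> 23) & 0xFF
    fraction = word & 0x7FFFFF
    if biased_exponent == 0 and fraction != 0:
        return word & 0x80000000   # keep only the sign bit
    return word
```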

Operations Producing Overflows: If an operation has a finite
input operand or operands, and if the operation produces a
rounded result that is too large to fit in the destination format,
the operation is said to have overflowed.

A floating-point overflow occurs if an R PLUS S, R MINUS S, R
TIMES S, or 2 MINUS S operation with finite input operand(s)
produces a result which, after rounding, has a magnitude
greater than or equal to 2^128. Positive or negative infinity will
appear as the final result if the rounded result is positive or
negative, respectively, and the overflow and inexact flags will
be HIGH.

Integer overflow occurs when the floating-point-to-integer
conversion operation attempts to convert a number which,
after rounding, is greater than 2^31 - 1 or less than -2^31. The
final result will be quiet NAN 7FA00000₁₆, and the invalid
operation and NAN flags will be HIGH. Note that the overflow
and inexact flags remain LOW for integer overflow.
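
The integer overflow rule can be sketched as a check on the
already-rounded integer value. The function name and return layout are
hypothetical; the rounding step itself is outside the scope of this
sketch, and only the invalid and NAN flags are modeled.

```python
QUIET_NAN = 0x7FA00000


def fp_to_int_final(rounded):
    """Return (final word, invalid flag, NAN flag) for a rounded integer value."""
    if rounded > 2**31 - 1 or rounded < -2**31:
        return QUIET_NAN, True, True           # integer overflow
    return rounded & 0xFFFFFFFF, False, False  # two's-complement result
```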

Operations Producing Underflows: If an operation produces
a floating-point rounded result having a magnitude too small to
be expressed as a normalized floating-point number, but
greater than zero, that operation is said to have underflowed.
Underflow occurs when an R PLUS S, R MINUS S, or R
TIMES S operation produces a result which, after rounding,
has a magnitude in the range:

0 < magnitude < 2^-126.

In such cases, the final result will be +0 (00000000₁₆) if the
rounded result is non-negative, and -0 (80000000₁₆) if the
rounded result is negative. The underflow, inexact, and zero
flags will be HIGH.

Underflow does not occur if the destination format is integer. If
the infinitely precise result of a floating-point-to-integer
conversion has a magnitude greater than 0 and less than 1, but
the rounded result is 0, the underflow flag remains LOW.

Operations with Infinities: In most cases, positive and
negative infinity are valid inputs for the R PLUS S, R MINUS S,
R TIMES S, and 2 MINUS S operations. Those cases for which
infinities are not valid inputs for these operations are listed in
Table 4.

Infinities in IEEE mode can be handled either as projective or
affine. The projective mode is selected when PROJ/AFF is
HIGH; the affine mode is selected when PROJ/AFF is LOW.
The only differences between the modes that are relevant to
Am29C325 operation occur during the addition and
subtraction of infinities:



Operation       Affine Mode    Projective Mode

(+∞) + (+∞)     Output +∞      Output 7FA00000₁₆ (quiet NAN),
                               set invalid and NAN flags

(-∞) + (-∞)     Output -∞      Output 7FA00000₁₆ (quiet NAN),
                               set invalid and NAN flags

(+∞) - (-∞)     Output +∞      Output 7FA00000₁₆ (quiet NAN),
                               set invalid and NAN flags

(-∞) - (+∞)     Output -∞      Output 7FA00000₁₆ (quiet NAN),
                               set invalid and NAN flags

If an R PLUS S, R MINUS S, or 2 MINUS S operation has
infinity as an input operand or operands, the final result, if
valid, is presumed to be exact. For example, adding +∞ and
2.0 will produce a final result of +∞; since the result is
considered exact, the inexact flag remains LOW.
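
The affine and projective rules for adding and subtracting two infinities can be sketched as a small model. This is an illustrative sketch only, not a description of the Am29C325 internals; the function names and the tuple return convention are assumptions made for the example. The quiet NAN word 7FA00000₁₆ is the one quoted in the text.

```python
import math

QUIET_NAN_WORD = 0x7FA00000  # quiet NAN pattern quoted in the text


def add_infinities(r, s, projective):
    """Model R PLUS S when both operands are infinite (hypothetical helper).

    Returns (result, invalid_flag, nan_flag). The result is either
    +/-infinity or the integer word 0x7FA00000.
    """
    assert math.isinf(r) and math.isinf(s)
    same_sign = (r > 0) == (s > 0)
    # Opposite-signed infinities are always invalid; like-signed
    # infinities are invalid in projective mode only.
    if projective or not same_sign:
        return QUIET_NAN_WORD, True, True
    return r, False, False


def sub_infinities(r, s, projective):
    """Model R MINUS S as R PLUS (-S)."""
    return add_infinities(r, -s, projective)
```

For example, `sub_infinities(math.inf, -math.inf, False)` models (+∞) - (-∞) in affine mode and returns +∞ with both flags LOW, while the same call with `projective=True` returns the quiet NAN word with both flags HIGH.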

Invalid Operations: If an input operand is invalid for the
operation to be performed, that operation is considered
invalid. When an invalid operation is performed, the
floating-point ALU produces a quiet NAN as the final result, and the
invalid operation flag goes HIGH. Table 4 lists the cases for
which the invalid flag is HIGH in IEEE mode, and the final
results produced for these operations.

TABLE 4. IEEE MODE INVALID OPERATIONS

Operation     Input Operand                  Final Result

R PLUS S      (+∞) + (-∞)                    7FA00000₁₆
              or (-∞) + (+∞)                 (quiet NAN)

R PLUS S      (+∞) + (+∞)                    7FA00000₁₆
              or (-∞) + (-∞) (Note 1)        (quiet NAN)

R MINUS S     (+∞) - (+∞)                    7FA00000₁₆
              or (-∞) - (-∞)                 (quiet NAN)

R MINUS S     (+∞) - (-∞)                    7FA00000₁₆
              or (-∞) - (+∞) (Note 1)        (quiet NAN)

R TIMES S     (+0) * (+∞)                    7FA00000₁₆
              or (+0) * (-∞)                 (quiet NAN)
              or (-0) * (+∞)
              or (-0) * (-∞)

R PLUS S      R or S is a signalling NAN     (Note 2)
R MINUS S
R TIMES S

2 MINUS S     S is a signalling NAN          (Note 2)

FP-TO-INT     R is a signalling or           (Note 2)
              quiet NAN

FP-TO-INT     R > 2^31 - 1                   7FA00000₁₆
              or R < -(2^31)                 (quiet NAN)

Notes: 1. These cases are invalid in projective mode only.

2. Results for these operations are described in the Operations
with NANs section.

The Sign Bit

For most floating-point operations, the sign bit of the final
result is unambiguous; i.e., there is only one sign bit value that
yields a numerically correct result. Operations that produce an
infinitely precise result of zero, however, present a problem, as
the IEEE floating-point format allows for representation of both
+0 and -0. The following rules can be used to determine the
signs of zero produced in such cases.

R PLUS S: The operations +x + (-x) and -x + (+x) produce a
final result of zero; the sign of the zero is dependent on the
rounding mode:

Rounding Mode        Sign of Final Result

Round to nearest     +
Round toward -∞      -
Round toward +∞      +
Round toward 0       +

Operations +0 + (-0) and -0 + (+0) produce a result of 0,
with the sign of the result determined by the table above.

The operation +0 + (+0) produces a final result of +0; the
operation -0 + (-0) produces a final result of -0.

R MINUS S: The operations +x - (+x) and -x - (-x) produce a
final result of zero; the sign of the zero is dependent on the
rounding mode:

Rounding Mode        Sign of Result

Round to nearest     +
Round toward -∞      -
Round toward +∞      +
Round toward 0       +

Operations +0 - (+0) and -0 - (-0) produce a result of 0, with
the sign of the result determined by the table above.

The operation +0 - (-0) produces a final result of +0; the
operation -0 - (+0) produces a final result of -0.

R TIMES S: The sign of any multiplication result other than a
NAN is the exclusive OR of the signs of the input operands.
Therefore, if x is non-negative,

+0 times +x produces a final result of +0,
+0 times -x produces a final result of -0,
-0 times +x produces a final result of -0,
-0 times -x produces a final result of +0.

2 MINUS S: If S equals 2, the final result is -0 for the round
toward -∞ mode, and +0 for all other rounding modes.
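
The sign rules for exact-zero results can be summarized in a short sketch. It assumes that the rounding-mode dependence stated for 2 MINUS S (a result of -0 only in round toward -∞, +0 otherwise) is also what governs the R PLUS S and R MINUS S cases, which is the convention of the IEEE standard. The function names are hypothetical, and sign bits are passed as 0 (positive) or 1 (negative).

```python
def zero_sum_sign(sign_r, sign_s, round_toward_minus_inf):
    """Sign bit of an exactly-zero R PLUS S result (illustrative model)."""
    if sign_r == sign_s:
        # (+0) + (+0) gives +0, (-0) + (-0) gives -0.
        return sign_r
    # Opposite signs cancelling: the sign depends on the rounding mode.
    return 1 if round_toward_minus_inf else 0


def zero_diff_sign(sign_r, sign_s, round_toward_minus_inf):
    """Sign bit of an exactly-zero R MINUS S result: negate S, then add."""
    return zero_sum_sign(sign_r, sign_s ^ 1, round_toward_minus_inf)


def product_sign(sign_r, sign_s):
    """Sign bit of any non-NAN R TIMES S result: exclusive OR of the signs."""
    return sign_r ^ sign_s
```

Treating subtraction as addition of the negated S operand reproduces the stated cases, for example +0 - (-0) giving +0 and -0 - (+0) giving -0 in every rounding mode.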

Rounding 

Rounding is performed whenever an operation produces an 
infinitely precise result that cannot be represented exactly in 
the destination format. For example, suppose a floating-point 
operation produces the infinitely precise result: 

1. 101010101 01010101010101\01X 2^. 

In this example, the fraction portion of the mantissa has 25 
bits; the IEEE floating-point format can accommodate only 23. 
The backslash (\) in the mantissa represents the boundary 
between the first 23 bits of the fraction and any remaining bits. 
Rounding is the process by which this result is approximated
by a representation that fits the destination format.

There are four rounding modes in IEEE mode: 1) round to
nearest, 2) round toward +∞, 3) round toward -∞, and 4)
round toward 0. The rounding mode is chosen using the
rounding mode select lines, RND0 and RND1. Table 5 lists the
select states needed to obtain the desired rounding mode.

TABLE 5. ROUNDING MODE SELECT

[Table 5: columns RND1, RND0, and Rounding Mode; the rows list
round to nearest, round toward -∞, round toward +∞, and round
toward 0. The select-line values are not legible in this copy.]

Round to Nearest: In this rounding mode the infinitely precise
result of an operation is rounded to the closest representation
that fits in the destination format. If the infinitely precise result
is exactly halfway between two representations, it is rounded
to the representation having an LSB of zero. Rounding is
performed both for floating-point and integer destination
formats.

Figure 9 illustrates four examples of the round-to-nearest
process for operations having a floating-point destination
format. The infinitely precise result of an operation is
represented by an "X" on the number line; the black dots on the
number line indicate those values that can be represented
exactly in the floating-point format.

Example 1:

In Figure 9(a), the infinitely precise result of an operation is:

2^20 + 2^-4 + 2^-5 = 1.00000000000000000000000\11 x 2^20

The result is rounded to the closest representable
floating-point value,

2^20 + 2^-3 = 1.00000000000000000000001 x 2^20

Example 2:

In Figure 9(b), the infinitely precise result of an operation is:

2^20 - 2^-4 + 2^-8 = 1.11111111111111111111111\0001 x 2^19

This result is rounded to the closest representable
floating-point value,

2^20 - 2^-4 = 1.11111111111111111111111 x 2^19

Example 3:

In Figure 9(c), the infinitely precise result of an operation is:

-(2^20 + 2^-3 + 2^-4) = -1.00000000000000000000001\1 x 2^20

This result is exactly halfway between two representable
floating-point values. Accordingly, it is rounded to the
closest representation with an LSB of zero, or

-(2^20 + 2*2^-3) = -1.00000000000000000000010 x 2^20

Example 4:

In Figure 9(d), the infinitely precise result of an operation is:

2^20 + 3*2^-3 = 1.00000000000000000000011 x 2^20

This result can be represented exactly in the floating-point
format, and is left unaltered by the rounding process.



[Figure 9, panels (a) through (d): number lines marking the
representable values near ±2^20 and, for each example, the
infinitely precise result "X" and the value it is rounded to.]

Figure 9. Floating-Point Rounding Examples for Round-to-Nearest Mode

Figure 10 illustrates four examples of the round-to-nearest
process for operations having an integer destination format.
The infinitely precise result of an operation is represented by
an "X" on the number line; the black dots on the number line
indicate those values that can be represented exactly in the
integer format.

Example 1:

In Figure 10(a), the infinitely precise result of an operation is:

2^10 - 2^-2 = 00...001111111111.11

The result is rounded to the closest representable integer
value,

2^10 = 00...010000000000

Example 2:

In Figure 10(b), the infinitely precise result of an operation is:

2^10 + 2^0 + 2^-3 = 00...010000000001.001

This result is rounded to the closest representable integer
value,

2^10 + 2^0 = 00...010000000001

Example 3:

In Figure 10(c), the infinitely precise result of an operation is:

-(2^10 + 2^0 + 2^-1) = 11...101111111110.1

This result is exactly halfway between two representable
integer values. Accordingly, it is rounded to the closest
representation with an LSB of zero, or

-(2^10 + 2*2^0) = 11...101111111110

Example 4:

In Figure 10(d), the infinitely precise result of an operation is:

2^10 + 3*2^0 = 00...010000000011

This result can be represented exactly in the integer format,
and is left unaltered by the rounding process.



[Figure 10, panels (a) through (d): number lines marking the
integers near ±2^10 and, for each example, the infinitely
precise result "X" and the value it is rounded to.]

Figure 10. Integer Rounding Examples for Round-to-Nearest Mode

Round Toward -∞: In this rounding mode the result of an
operation is rounded to the closest representation that is less
than or equal to the infinitely precise result, and which fits the
destination format. Rounding is performed both for
floating-point and integer destination formats.

Figure 11 illustrates four examples of the round toward -∞
process for operations having a floating-point destination
format. The infinitely precise result of an operation is
represented by an "X" on the number line; the black dots on the
number line indicate those values that can be represented
exactly in the floating-point format.

Example 1:

In Figure 11(a), the infinitely precise result of an operation is:

2^20 + 2^-4 + 2^-5 = 1.00000000000000000000000\11 x 2^20

This result cannot be represented exactly in floating-point
format, and is rounded to the next-smaller floating-point
representation:

2^20 = 1.00000000000000000000000 x 2^20

Example 2:

In Figure 11(b), the infinitely precise result of an operation is:

2^20 - 2^-4 + 2^-8 = 1.11111111111111111111111\0001 x 2^19

This result cannot be represented exactly in floating-point
format, and is rounded to the next-smaller floating-point
representation:

2^20 - 2^-4 = 1.11111111111111111111111 x 2^19

Example 3:

In Figure 11(c), the infinitely precise result of an operation is:

-(2^20 + 2^-3 + 2^-4) = -1.00000000000000000000001\1 x 2^20

This result cannot be represented exactly in floating-point
format, and is rounded to the next-smaller floating-point
representation:

-(2^20 + 2*2^-3) = -1.00000000000000000000010 x 2^20

Example 4:

In Figure 11(d), the infinitely precise result of an operation is:

2^20 + 3*2^-3 = 1.00000000000000000000011 x 2^20

This result can be represented exactly in the floating-point
format, and is left unaltered by the rounding process.



[Figure 11, panels (a) through (d): number lines marking the
representable values near ±2^20 and, for each example, the
infinitely precise result "X" and the value it is rounded to.]

Figure 11. Floating-Point Rounding Examples for Round Toward -∞ Mode

Figure 12 illustrates four examples of the round toward -∞
process for operations having an integer destination format.
The infinitely precise result of an operation is represented by
an "X" on the number line; the black dots on the number line
indicate those values that can be exactly represented in the
integer format.

Example 1:

In Figure 12(a), the infinitely precise result of an operation is:

2^10 - 2^-2 = 00...001111111111.11

The result is rounded to the next-smaller representable
integer value,

2^10 - 2^0 = 00...001111111111

Example 2:

In Figure 12(b), the infinitely precise result of an operation is:

2^10 + 2^0 + 2^-3 = 00...010000000001.001

This result is rounded to the next-smaller representable
integer value,

2^10 + 2^0 = 00...010000000001

Example 3:

In Figure 12(c), the infinitely precise result of an operation is:

-(2^10 + 2^0 + 2^-1) = 11...101111111110.1

This result is rounded to the next-smaller representable
integer value:

-(2^10 + 2*2^0) = 11...101111111110

Example 4:

In Figure 12(d), the infinitely precise result of an operation is:

2^10 + 3*2^0 = 00...010000000011

This result can be represented exactly in the integer format,
and is unaltered by the rounding process.



[Figure 12, panels (a) through (d): number lines marking the
integers near ±2^10 and, for each example, the infinitely
precise result "X" and the value it is rounded to.]

Figure 12. Integer Rounding Examples for Round Toward -∞ Mode

Round Toward +∞: In this rounding mode the result of an
operation is rounded to the closest representation that is
greater than or equal to the infinitely precise result, and which
fits the destination format. Rounding is performed both for
floating-point and integer destination formats.

Figure 13 illustrates four examples of the round toward +∞
process for operations having a floating-point destination
format. The infinitely precise result of an operation is
represented by an "X" on the number line; the black dots on the
number line indicate those values that can be represented
exactly in the floating-point format.

Example 1:

In Figure 13(a), the infinitely precise result of an operation is:

2^20 + 2^-4 + 2^-5 = 1.00000000000000000000000\11 x 2^20

This result cannot be represented exactly in floating-point
format, and is rounded to the next-larger floating-point
representation:

2^20 + 2^-3 = 1.00000000000000000000001 x 2^20

Example 2:

In Figure 13(b), the infinitely precise result of an operation is:

2^20 - 2^-4 + 2^-8 = 1.11111111111111111111111\0001 x 2^19

This result cannot be represented exactly in floating-point
format, and is rounded to the next-larger floating-point
representation:

2^20 = 1.00000000000000000000000 x 2^20

Example 3:

In Figure 13(c), the infinitely precise result of an operation is:

-(2^20 + 2^-3 + 2^-4) = -1.00000000000000000000001\1 x 2^20

This result cannot be represented exactly in floating-point
format, and is rounded to the next-larger floating-point
representation:

-(2^20 + 2^-3) = -1.00000000000000000000001 x 2^20

Example 4:

In Figure 13(d), the infinitely precise result of an operation is:

2^20 + 3*2^-3 = 1.00000000000000000000011 x 2^20

This result can be represented exactly in the floating-point
format; no rounding takes place.



[Figure 13, panels (a) through (d): number lines marking the
representable values near ±2^20 and, for each example, the
infinitely precise result "X" and the value it is rounded to.]

Figure 13. Floating-Point Rounding Examples for Round Toward +∞ Mode

Figure 14 illustrates four examples of the round toward +∞
process for operations having an integer destination format.
The infinitely precise result of an operation is represented by
an "X" on the number line; the black dots on the number line
indicate those values that can be exactly represented in the
integer format.

Example 1:

In Figure 14(a), the infinitely precise result of an operation is:

2^10 - 2^-2 = 00...001111111111.11

The result is rounded to the next-larger representable
integer value,

2^10 = 00...010000000000

Example 2:

In Figure 14(b), the infinitely precise result of an operation is:

2^10 + 2^0 + 2^-3 = 00...010000000001.001

This result is rounded to the next-larger representable
integer value,

2^10 + 2*2^0 = 00...010000000010

Example 3:

In Figure 14(c), the infinitely precise result of an operation is:

-(2^10 + 2^0 + 2^-1) = 11...101111111110.1

This result is rounded to the next-larger representable
integer value:

-(2^10 + 2^0) = 11...101111111111

Example 4:

In Figure 14(d), the infinitely precise result of an operation is:

2^10 + 3*2^0 = 00...010000000011

This result can be represented exactly in the integer
format; no rounding takes place.



[Figure 14, panels (a) through (d): number lines marking the
integers near ±2^10 and, for each example, the infinitely
precise result "X" and the value it is rounded to.]

Figure 14. Integer Rounding Examples for Round Toward +∞ Mode

Round Toward 0: In this rounding mode the result of an
operation is rounded to the closest representation whose
magnitude is less than or equal to the infinitely precise result,
and which fits the destination format. Rounding is performed
both for floating-point and integer destination formats.

Figure 15 illustrates four examples of the round toward 0
process for operations having a floating-point destination
format. The infinitely precise result of an operation is
represented by an "X" on the number line; the black dots on the
number line indicate those values that can be represented
exactly in the floating-point format.

Example 1:

In Figure 15(a), the infinitely precise result of an operation is:

2^20 + 2^-4 + 2^-5 = 1.00000000000000000000000\11 x 2^20

This result cannot be represented exactly in floating-point
format, and is rounded to:

2^20 = 1.00000000000000000000000 x 2^20

Example 2:

In Figure 15(b), the infinitely precise result of an operation is:

2^20 - 2^-4 + 2^-8 = 1.11111111111111111111111\0001 x 2^19

This result cannot be represented exactly in floating-point
format, and is rounded to:

2^20 - 2^-4 = 1.11111111111111111111111 x 2^19

Example 3:

In Figure 15(c), the infinitely precise result of an operation is:

-(2^20 + 2^-3 + 2^-4) = -1.00000000000000000000001\1 x 2^20

This result cannot be represented exactly in floating-point
format, and is rounded to:

-(2^20 + 2^-3) = -1.00000000000000000000001 x 2^20

Example 4:

In Figure 15(d), the infinitely precise result of an operation is:

2^20 + 3*2^-3 = 1.00000000000000000000011 x 2^20

This result can be represented exactly in the floating-point
format, and is unaffected by the rounding process.

Figure 15. Floating-Point Rounding Examples for Round Toward 0 Mode

Figure 16 illustrates four examples of the round toward 0
process for operations having an integer destination format.
The infinitely precise result of an operation is represented by
an "X" on the number line; the black dots on the number line
indicate those values that can be exactly represented in the
integer format.

Example 1:

In Figure 16(a), the infinitely precise result of an operation is:

2^10 - 2^-2 = 00...001111111111.11

The result is rounded to:

2^10 - 2^0 = 00...001111111111

Example 2:

In Figure 16(b), the infinitely precise result of an operation is:

2^10 + 2^0 + 2^-3 = 00...010000000001.001

The result is rounded to:

2^10 + 2^0 = 00...010000000001

Example 3:

In Figure 16(c), the infinitely precise result of an operation is:

-(2^10 + 2^0 + 2^-1) = 11...101111111110.1

The result is rounded to:

-(2^10 + 2^0) = 11...101111111111

Example 4:

In Figure 16(d), the infinitely precise result of an operation is:

2^10 + 3*2^0 = 00...010000000011

This result can be represented exactly in the integer format,
and is unaffected by the rounding process.
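
The four rounding modes described in this section can be modeled with exact rational arithmetic. The following is a minimal sketch, not the device's rounding logic: the function names and the mode strings are assumptions made for the example. A floating-point destination is treated as a grid whose spacing is the weight of the 23rd fraction bit in the binade containing the value, and an integer destination as a grid of spacing 1.

```python
from fractions import Fraction
import math


def round_to_grid(x, step, mode):
    """Round exact value x to a multiple of step (illustrative model).

    mode is one of "nearest", "minus_inf", "plus_inf", "zero".
    """
    q = Fraction(x) / Fraction(step)
    lo = math.floor(q)
    if mode == "minus_inf":
        n = lo
    elif mode == "plus_inf":
        n = math.ceil(q)
    elif mode == "zero":
        n = math.trunc(q)
    elif mode == "nearest":
        frac = q - lo
        half = Fraction(1, 2)
        # On an exact tie, pick the neighbor whose LSB is zero (even n).
        if frac > half or (frac == half and lo % 2 == 1):
            n = lo + 1
        else:
            n = lo
    else:
        raise ValueError(mode)
    return n * Fraction(step)


def float_lsb(x):
    """Weight of the last of 23 fraction bits for the binade holding |x|."""
    m = abs(Fraction(x))
    e = m.numerator.bit_length() - m.denominator.bit_length()
    if Fraction(2) ** e > m:
        e -= 1          # now 2**e <= m < 2**(e + 1)
    return Fraction(2) ** (e - 23)


def round_float(x, mode):
    """Round to a 24-bit significand (range limits are ignored here)."""
    return round_to_grid(x, float_lsb(x), mode)


def round_int(x, mode):
    """Round to an integer destination."""
    return round_to_grid(x, 1, mode)
```

With `P = lambda k: Fraction(2) ** k`, the call `round_float(-(P(20) + P(-3) + P(-4)), "nearest")` reproduces the halfway case of Figure 9(c), and `round_int(-(P(10) + 1 + P(-1)), "plus_inf")` reproduces Figure 14(c). The sketch does not model overflow, underflow, or the flags.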



[Figure 16, panels (a) through (d): number lines marking the
integers near ±2^10 and, for each example, the infinitely
precise result "X" and the value it is rounded to.]

Figure 16. Integer Rounding Examples for Round Toward 0 Mode



Flag Operation 

The Am29C325 generates six status flags to monitor floating- 
point processor operation. The following is a summary of flag 
conventions in IEEE mode: 

Invalid Operation Flag: The invalid operation flag is HIGH 
when an input operand is invalid for the operation to be 
performed. Table 4 lists the cases for which the invalid 
operation flag is HIGH in IEEE mode, and the corresponding 
final result. In cases where the invalid operation flag is HIGH, 
the overflow, underflow, zero, and inexact flags are LOW; the 
NAN flag will be HIGH. 

Overflow Flag: The overflow flag is HIGH if an R PLUS S, R
MINUS S, R TIMES S, or 2 MINUS S operation with finite input
operand(s) produces a result which, after rounding, has a
magnitude greater than or equal to 2^128. The final result will
be +∞ or -∞.

Underflow Flag: The underflow flag is HIGH if an R PLUS S,
R MINUS S, or R TIMES S operation produces a result which,
after rounding, has a magnitude in the range:

0 < magnitude < 2^-126.

The final result will be +0 (00000000₁₆) if the rounded result is
non-negative, and -0 (80000000₁₆) if the rounded result is
negative.

Inexact Flag: The inexact flag is HIGH if the final result of an 
R PLUS S, R MINUS S, R TIMES S, 2 MINUS S, INT-TO-FP, or 
FP-TO-INT operation is not equal to the infinitely precise 
result. Note that if the underflow or overflow flag is HIGH, the 
inexact flag will also be HIGH. 

Zero Flag: The zero flag is HIGH if the final result of an
operation is zero. For operations producing an IEEE
floating-point number, the flag accompanies outputs +0 (00000000₁₆)
and -0 (80000000₁₆). For operations producing an integer,
the flag accompanies the output 0 (00000000₁₆).

NAN Flag: The NAN flag is HIGH if an R PLUS S, R MINUS S, 
R TIMES S, 2 MINUS S, or FP-TO-INT operation produces a 
NAN as a final result. 
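
The overflow and underflow conventions for IEEE mode can be sketched as a range check on the rounded magnitude. This is an illustrative model under stated assumptions: the function name and the flag names in the returned set are hypothetical, and the infinity encodings 7F800000₁₆ and FF800000₁₆ are the standard IEEE single-precision patterns rather than values quoted in the text.

```python
def ieee_range_check(sign, rounded_magnitude):
    """Final word and flags for an out-of-range rounded result.

    sign is 0 or 1; rounded_magnitude is the magnitude after rounding,
    for an operation with finite input operands. Returns (word, flags),
    or (None, set()) when the result is in the normalized range or zero.
    """
    if rounded_magnitude >= 2.0 ** 128:
        # Overflow: signed infinity, overflow and inexact flags HIGH.
        word = 0xFF800000 if sign else 0x7F800000
        return word, {"overflow", "inexact"}
    if 0 < rounded_magnitude < 2.0 ** -126:
        # Underflow: zero of the same sign; no gradual underflow.
        word = 0x80000000 if sign else 0x00000000
        return word, {"underflow", "inexact", "zero"}
    return None, set()
```

An exact zero result falls through to the last branch because it is not an underflow; its zero flag would be set by the normal result path, which this sketch does not model.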

Operation in DEC Mode

When input signal IEEE/DEC is LOW, the DEC mode of
operation is selected. In this mode the Am29C325 uses the
single-precision floating-point format (floating F) set forth in
Digital Equipment Corporation's VAX Architecture Manual. In
addition, the DEC mode complies with most other aspects of
single-precision floating-point operation outlined in the
manual; differences are discussed in Appendix B.

DEC Floating-Point Format 

The DEC single-precision floating-point word is 32 bits wide,
and is arranged in the format shown in Figure 17. The
floating-point word is divided into three fields: a single-bit sign, an 8-bit
biased exponent, and a 23-bit fraction.

The sign bit indicates the sign of the floating-point number's
value. Non-negative values have a sign of 0, negative values a
sign of 1.

The biased exponent is an 8-bit unsigned integer field
representing a multiplicative factor of some power of two. The bias
value is 128. If, for example, the multiplicative factor for a
floating-point number is to be 2^a, the value of the biased
exponent would be a + 128; "a" is called the true exponent.

The fraction is a 23-bit unsigned fractional field containing the
23 LSBs of the floating-point number's 24-bit mantissa. The
weight of this field's MSB is 2^-2; the weight of the LSB is 2^-24.

A floating-point number is evaluated or interpreted per the
following conventions:

let s = sign bit
    e = biased exponent
    f = fraction

if e = 0 and s = 0 ... value = 0
if e = 0 and s = 1 ... value = DEC-reserved operand
if 0 < e ≤ 255 ... value = (-1)^s (2^(e-128)) (0.1f)
                   (normalized number)



Zero: The value zero always has a sign of zero. 

DEC-Reserved Operand: A DEC-reserved operand does not 
represent a numeric value, but is interpreted as a signal or 
symbol. DEC-reserved operands are used to indicate invalid 
operations and operations whose results have overflowed the 
destination format. They may also be used to pass symbolic 
information from one calculation to another. 



Normalized Number: A normalized number represents a
quantity with magnitude greater than or equal to 2^-128 but
less than 2^127.

Example 1:

The number +3.5 can be represented in floating-point
format as follows:

+3.5 = 11.1₂ x 2^0
     = .111₂ x 2^2

sign = 0

biased exponent = 2₁₀ + 128₁₀ = 130₁₀
                = 10000010₂

fraction = 11000000000000000000000₂

(the leading 1 is implied in the format)

Concatenating these fields produces the floating-point word
41600000₁₆.

Example 2:

The number -11.375 can be represented in floating-point
format as follows:

-11.375 = -1011.011₂ x 2^0
        = -.1011011₂ x 2^4

sign = 1

biased exponent = 4₁₀ + 128₁₀ = 132₁₀
                = 10000100₂

fraction = 01101100000000000000000₂

(the leading 1 is implied in the format)

Concatenating these fields produces the floating-point word
C2360000₁₆.
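
The field layout and the two worked examples can be checked with a short sketch that packs and unpacks a DEC floating F word. The function names are hypothetical, and the encoder assumes its argument is exactly representable, so it performs no rounding. Python's `math.frexp` is convenient here because it returns a mantissa in the range 0.5 ≤ m < 1, which matches the 0.1f convention of the DEC format.

```python
import math


def dec_f_encode(x):
    """Pack an exactly representable value into a DEC F word (sketch)."""
    if x == 0:
        return 0x00000000          # zero always has a sign of zero
    sign = 1 if x < 0 else 0
    mantissa, true_exp = math.frexp(abs(x))   # 0.5 <= mantissa < 1
    scaled = mantissa * (1 << 24)             # 24-bit mantissa as an integer
    if scaled != int(scaled):
        raise ValueError("value needs rounding; not handled in this sketch")
    biased = true_exp + 128
    if not 1 <= biased <= 255:
        raise ValueError("value is outside the normalized range")
    fraction = int(scaled) - (1 << 23)        # drop the implied leading 1
    return (sign << 31) | (biased << 23) | fraction


def dec_f_decode(word):
    """Unpack a DEC F word; returns None for a DEC-reserved operand."""
    sign = (word >> 31) & 1
    biased = (word >> 23) & 0xFF
    fraction = word & 0x7FFFFF
    if biased == 0:
        return None if sign else 0.0
    # Restore the implied 1, then scale: (0.1f) * 2**(e - 128).
    magnitude = math.ldexp((1 << 23) | fraction, biased - 128 - 24)
    return -magnitude if sign else magnitude
```

Here `dec_f_encode(3.5)` and `dec_f_encode(-11.375)` yield the words derived in Examples 1 and 2.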

DEC Mode Integer Format

DEC mode integer format is identical to that of the IEEE mode.
Integer numbers are represented as 32-bit, two's-complement
words (Figure 8 depicts the integer format). The integer word
can represent a range of integer values from -2^31 to 2^31 - 1.



Operations 

All eight floating-point ALU operations discussed in the 
General Description section can be performed in DEC mode. 



[Figure 17 layout: bit 31 is the sign bit (S); bits 30 through
23 are the biased exponent (E), with bit weights 2^7 down to
2^0; bits 22 through 0 are the fraction (F), with bit weights
2^-2 down to 2^-24.

VALUE = (-1)^S (2^(E-128)) (0.1F)]

Figure 17. DEC-Mode Floating-Point Format



Various exceptional aspects of the R PLUS S, R MINUS S, R
TIMES S, 2 MINUS S, INT-TO-FP, and FP-TO-INT operations
for this mode are described below. The IEEE-TO-DEC and
DEC-TO-IEEE operations are discussed separately in the
IEEE-TO-DEC and DEC-TO-IEEE Operations section.

Operations with DEC-Reserved Operands: DEC-reserved
operands arise in two ways: 1) they can be generated by the
Am29C325 to indicate that an invalid operation or floating-point
overflow has taken place, or 2) they can be provided by the
user as an input operand.

When a DEC-reserved operand appears as an input operand,
the final result of the operation is the same DEC-reserved
operand. If an operation has two DEC-reserved operands as
inputs, the DEC-reserved operand on the R port becomes the
final result.

The NAN flag will be HIGH whenever an operation produces a
DEC-reserved operand as a final result.

Example 1:

Suppose the floating-point addition operation is performed
with the following input operands:

R port: 40800000₁₆ (0.1₂ x 2^1)

S port: 80012345₁₆ (DEC-reserved operand)

Result: This operation produces the DEC-reserved operand
on the S port, 80012345₁₆, as the final result. The
NAN flag will be HIGH.

Example 2:

Suppose the floating-point multiplication operation is
performed with the following input operands:

R port: 80765432₁₆ (DEC-reserved operand)
S port: 80000001₁₆ (DEC-reserved operand)

Result: Since both input operands are DEC-reserved
operands, the operand on the R port, 80765432₁₆, is the
final result of the operation. The NAN flag will be
HIGH.
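
The propagation rule for DEC-reserved operands can be expressed as a short sketch. The function names are hypothetical; the test for a reserved operand (sign bit 1 with a biased exponent of 0) follows the interpretation conventions given for the DEC format.

```python
def is_dec_reserved(word):
    """True if a 32-bit DEC F word has sign = 1 and biased exponent = 0."""
    sign = (word >> 31) & 1
    biased = (word >> 23) & 0xFF
    return sign == 1 and biased == 0


def reserved_operand_result(r_word, s_word):
    """Return the reserved operand that becomes the final result, or None.

    The R port takes priority when both inputs are DEC-reserved operands.
    When this returns a word, the NAN flag would be HIGH.
    """
    if is_dec_reserved(r_word):
        return r_word
    if is_dec_reserved(s_word):
        return s_word
    return None
```

Applied to the two examples above, `reserved_operand_result(0x40800000, 0x80012345)` selects the S-port operand and `reserved_operand_result(0x80765432, 0x80000001)` selects the R-port operand.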

Operations Producing Overflows: If an operation produces
a rounded result that is too large to fit in the destination
format, that operation is said to have overflowed.

A floating-point overflow occurs if an R PLUS S, R MINUS S, R
TIMES S, or 2 MINUS S operation with finite input operand(s)
produces a result which, after rounding, has a magnitude
greater than or equal to 2^127. The final result in such cases will
be DEC-reserved operand 80000000₁₆; the overflow, inexact,
and NAN flags will be HIGH.

Integer overflow occurs when the "floating-point-to-integer"
conversion operation attempts to convert to integer a
floating-point number which, after rounding, is greater than 2^31 - 1 or
less than -2^31. The final result in such cases will be
DEC-reserved operand 80000000₁₆; the invalid operation flag will
be HIGH. Note that the overflow and inexact flags remain
LOW for integer overflow.

Operations Producing Underflows: If an operation produces
a floating-point result which, after rounding, has a magnitude
too small to be expressed as a normalized floating-point
number, but greater than 0, that operation is said to have
underflowed. Underflow occurs when an R PLUS S, R MINUS
S, or R TIMES S operation produces a result which, after
rounding, has the magnitude:

0 < magnitude < 2^-128.

The final result in such cases will be 0 (00000000₁₆). The
underflow, inexact, and zero flags will be HIGH.

Underflow does not occur if the destination format is integer. If
the infinitely precise result of a floating-point-to-integer
conversion has a magnitude greater than 0 and less than 1, but
the rounded result is 0, the underflow flag remains LOW.

Invalid Operations: If an input operand is invalid for the
operation to be performed, that operation is considered
invalid. There is only one invalid operation in DEC mode:
performing a floating-point-to-integer conversion on a value
too large to be converted to an integer. In this case, the final
result will be DEC-reserved operand 80000000₁₆, and the
invalid operation and NAN flags will be HIGH.

Sign Bit 

For all operations producing a DEC floating-point result, the 
sign bit of the final result is unambiguous; i.e., there is only one 
sign bit value that yields a numerically correct result. 



Rounding 

There are four rounding modes for DEC operation: 1) round to 
nearest, 2) round toward +'», 3) round toward -", and 4) 
round toward 0. The round toward + ~, round toward -<», and 
round toward modes are performed in a manner identical to 
that for IEEE operation; refer to the Rounding section under 
Operation in IEEE Mode. The round to nearest mode is 
similar to that for IEEE operation, but differs in one respect: for 
the case in which the infinitely precise result of an operation is 
exactly halfway between two representable values, DEC round 
to nearest mode rounds to the value with the larger magni- 
tude, rather than to the value whose LSB is 0. 
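
The difference between the two round-to-nearest rules shows up only for exact ties. The following is a minimal sketch, not the device's logic: it rounds an exact value to the nearest integer step (treating one LSB as 1), and the function name `round_to_nearest` and its `dec_mode` parameter are hypothetical.

```python
from fractions import Fraction
import math


def round_to_nearest(value: Fraction, dec_mode: bool) -> int:
    """Round an exact value to the nearest integer (one LSB = 1).

    dec_mode=True  : ties go to the value with the larger magnitude.
    dec_mode=False : ties go to the value whose LSB is 0 (IEEE rule).
    """
    lower = math.floor(value)          # nearest representable value below
    remainder = value - lower          # always in the range [0, 1)
    if remainder < Fraction(1, 2):
        return lower
    if remainder > Fraction(1, 2):
        return lower + 1
    # Exactly halfway between two representable values.
    if dec_mode:
        return lower + 1 if value > 0 else lower
    return lower if lower % 2 == 0 else lower + 1


print(round_to_nearest(Fraction(5, 2), dec_mode=True))   # DEC rule
print(round_to_nearest(Fraction(5, 2), dec_mode=False))  # IEEE rule
```

For any value that is not an exact tie, both rules return the same result.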

Flag Operation 

The Am29C325 generates six status flags to monitor floating- 
point processor operation. The following is a summary of flag 
operation in DEC mode: 

Invalid Operation Flag: The invalid operation flag is HIGH if 
the FP-TO-INT operation is performed on a floating-point 
number too large to be converted to an integer. The final result 
for such an operation will be the DEC-reserved operand 
80000000₁₆.

Overflow Flag: The overflow flag is HIGH if an R PLUS S, R 
MINUS S, R TIMES S, or 2 MINUS S operation produces a 
result which, after rounding, has a magnitude greater than or 
equal to 2^127. The final result will be the DEC-reserved
operand 80000000₁₆.

Underflow Flag: The underflow flag is HIGH if an R PLUS S, 
R MINUS S, or R TIMES S operation produces a result which, 
after rounding, has a magnitude in the range: 

0 < magnitude < 2^-128.

The final result will be 0 (00000000₁₆) in such cases.

Inexact Flag: The inexact flag is HIGH if the final result of an 
R PLUS S, R MINUS S, R TIMES S, 2 MINUS S, INT-TO-FP, or 
FP-TO-INT operation is not equal to the infinitely precise 
result. Note that if the underflow or overflow flag is HIGH, the 
inexact flag will also be HIGH. 

Zero Flag: The zero flag is HIGH if the final result of an 
operation is 0. For operations producing an integer or a DEC 
floating-point number, the flag accompanies the output 
(00000000₁₆). (It should be noted that any operation producing
a floating-point 0 in DEC mode will output 00000000₁₆.)

NAN Flag: The NAN flag is HIGH if an R PLUS S, R MINUS S, 
R TIMES S, 2 MINUS S, or FP-TO-INT operation produces a 
DEC-reserved operand as the final result.
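
The flag rules for DEC-mode arithmetic results can be summarized in a few lines. This is a sketch under stated assumptions rather than a model of the device: the function `dec_arithmetic_flags` and its arguments are hypothetical, the rounded result is passed in as an exact rational, and only the overflow, underflow, inexact, and zero rules described above are covered (the FP-TO-INT invalid case is omitted).

```python
from fractions import Fraction

DEC_RESERVED_OPERAND = 0x80000000
OVERFLOW_THRESHOLD = Fraction(2) ** 127        # magnitude >= 2^127 overflows
UNDERFLOW_THRESHOLD = Fraction(1, 2 ** 128)    # 0 < magnitude < 2^-128 underflows


def dec_arithmetic_flags(rounded_result: Fraction, exact: bool):
    """Return (forced_output, flags) for a DEC-mode arithmetic result.

    forced_output is None when the rounded result itself is output.
    """
    flags = {"INV": False, "OVF": False, "UNF": False,
             "INE": not exact, "ZER": rounded_result == 0, "NAN": False}
    forced_output = None
    magnitude = abs(rounded_result)
    if magnitude >= OVERFLOW_THRESHOLD:
        # Overflow: DEC-reserved operand; overflow, inexact, NAN flags HIGH.
        flags.update(OVF=True, INE=True, NAN=True)
        forced_output = DEC_RESERVED_OPERAND
    elif 0 < magnitude < UNDERFLOW_THRESHOLD:
        # Underflow: output 0; underflow, inexact, zero flags HIGH.
        flags.update(UNF=True, INE=True, ZER=True)
        forced_output = 0x00000000
    return forced_output, flags
```

Note that the inexact flag is forced HIGH on both overflow and underflow, matching the statement that it accompanies either flag.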

IEEE-TO-DEC and DEC-TO-IEEE Operations

The IEEE-TO-DEC and DEC-TO-IEEE operations are used to 
convert floating-point numbers between the IEEE and DEC 
formats. Both operations work in a manner independent of the
IEEE/DEC mode control. 

IEEE-TO-DEC Conversion 

The operation converts an IEEE floating-point number to DEC 
floating-point format. Most conversions are exact; in no case 
does the round mode have any effect on the final result. There
are, however, a few exceptional cases: 

a) If the IEEE floating-point input has a magnitude greater than
or equal to 2^127, it is too large to be represented by a DEC
floating-point number. The final result will be the
DEC-reserved operand 80000000₁₆; the overflow, inexact, and
NAN flags will be HIGH.

b) If the IEEE floating-point input is a NAN, the final result will
be the DEC-reserved operand 80000000₁₆; the invalid and
NAN flags will be HIGH.

c) If the IEEE floating-point input is a denormalized number,
the final result will be a DEC 0 (00000000₁₆); the zero flag
will be HIGH.

d) If the IEEE floating-point input is +0 or -0, the final result
will be a DEC 0 (00000000₁₆); the zero flag will be HIGH.
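
The exceptional cases above can be sketched at the bit level. The code below is an illustration only, under assumptions that are not stated in this section: the DEC format is taken to have an exponent bias of 128 and a 0.1f significand (as in VAX F_floating), so that a normal IEEE number converts by adding 2 to the biased exponent; infinities are assumed to fall under case (a). The function name `ieee_to_dec` and the flag-set return value are hypothetical.

```python
DEC_RESERVED_OPERAND = 0x80000000


def ieee_to_dec(word: int):
    """Convert a 32-bit IEEE single to the Am29C325 DEC layout.

    Returns (result_word, set_of_flags_that_are_HIGH).
    """
    sign = (word >> 31) & 1
    exponent = (word >> 23) & 0xFF
    fraction = word & 0x7FFFFF

    if exponent == 0xFF and fraction != 0:
        # Case (b): NAN input.
        return DEC_RESERVED_OPERAND, {"INV", "NAN"}
    if exponent >= 0xFE:
        # Case (a): magnitude >= 2^127 (infinities assumed to land here).
        return DEC_RESERVED_OPERAND, {"OVF", "INE", "NAN"}
    if exponent == 0:
        # Cases (c) and (d): denormalized numbers and +0 / -0.
        return 0x00000000, {"ZER"}
    # Exact conversion: 1.f * 2^(e-127) == 0.1f * 2^((e+2)-128).
    return (sign << 31) | ((exponent + 2) << 23) | fraction, set()


print(hex(ieee_to_dec(0x3F800000)[0]))  # IEEE 1.0 converted to DEC layout
```

Because the fraction field is copied unchanged, the round mode plays no part in the conversion, which is consistent with the text above.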

DEC-TO-IEEE Conversion 

This operation converts a DEC floating-point number to IEEE 
floating-point format. Most conversions are exact; in no case 
does the round mode have any effect on the final result. There 
are, however, a few exceptional cases: 



a) If the DEC floating-point input is not 0, but has a magnitude
less than 2^-126, it is too small to be expressed as a
normalized IEEE floating-point number. The final result will
be an IEEE floating-point 0 having the same sign as the
input (00000000₁₆ for positive inputs and 80000000₁₆ for
negative inputs); the underflow, inexact, and zero flags will
be HIGH.

b) If the DEC floating-point input is a DEC-reserved operand,
the result will be quiet NAN 7FA0000i6; the invalid opera- 
tion and NAN flags will be HIGH. 

c) If the DEC floating-point input is 0, the final result will be
IEEE floating-point +0 (00000000₁₆); the zero flag will be
HIGH.

APPENDIX A

DIFFERENCES BETWEEN THE IEEE 
PROPOSED STANDARD FOR BINARY 
FLOATING-POINT ARITHMETIC AND THE 
Am29C325'S IEEE MODE 

When operated in IEEE mode, the Am29C325 High-Speed
Floating-Point Processor complies with the single-precision 
portion of the IEEE Proposed Standard for Binary Floating- 
Point Arithmetic (P754, draft 10.0) in most respects. There are, 
however, several differences: 

Denormalized Numbers 

The Am29C325 does not handle denormalized numbers. A 
denormalized input will be converted to zero of the same sign 
before the specified operation takes place. The operation 
proceeds in exactly the same manner as if the input were +0
or -0, producing the same numerical result and flags.

If the result of an operation, after rounding, has a magnitude
smaller than 2^-126, the result is replaced by a zero of the
same sign.

Representation of Overflows 

In some rounding modes the proposed IEEE standard requires 
that overflows be represented as the format's most-positive or 
most-negative finite number. In particular: 

-When rounding toward 0, all overflows should produce a 
result of the largest representable finite number with the 
sign of the intermediate result. 

-When rounding toward -∞, all positive overflows should
produce a result of the largest representable positive finite
number.

-When rounding toward +∞, all negative overflows should
produce a result of the largest representable negative finite
number.

The Am29C325, however, always represents positive overflows
as +∞ and negative overflows as -∞, regardless of
rounding mode.

Projective Mode 

The proposed IEEE standard provides only for an affine mode
to control the handling of infinities. The Am29C325 provides
both affine and projective modes; the desired mode can be
selected by the user.

Traps 

The proposed IEEE standard stipulates that the user be able 
to request a trap on any exception. The Am29C325 does not
support trapped operation, and behaves as if traps are 
disabled. 

Resetting of Flags 

The proposed IEEE standard states that once an exception 
flag has been set, it is reset only at the user's request. The 
Am29C325's flags, however, reflect the status of the most 
recent operation. 

Generation of the Underflow Flag 

The proposed IEEE standard suggests several possible crite- 
ria for determining if underflow occurs. These criteria generate 
underflow flags that differ in subtle ways. The underflow 
criteria chosen for the Am29C325 stipulate that underflow 
occurs if: 

a) the rounded result of an operation has a magnitude in the
range:

    0 < magnitude < 2^-126,

and

b) the final result is not equal to the infinitely precise result.

Since the Am29C325 never produces a denormalized number 
as the final result of a calculation, condition (b) is true 
whenever (a) is true. Note then that the operation of the 
Am29C325's underflow flag is somewhat different than that of 
an "IEEE standard" system using the same underflow criteria. 
For example, if an operation should produce an infinitely
precise result that is exactly 2^-127, an "IEEE standard"
system would produce that value as the final result, expressed 
as a denormalized number. Since that system's final result is 
exact, the underflow flag would remain LOW. The Am29C325, 
on the other hand, would output zero; since its final result is 
not exact, the underflow flag would be HIGH. 
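
The contrast between the two underflow behaviors can be made concrete with exact arithmetic. This is a simplified sketch, not a description of either implementation: it assumes the infinitely precise result is supplied as a rational number, treats a result as exactly representable by an "IEEE standard" system when it is a multiple of the smallest single-precision denormalized step (2^-149), and ignores results that round up to 2^-126. The name `underflow_flags` is hypothetical.

```python
from fractions import Fraction

MIN_NORMAL = Fraction(1, 2 ** 126)      # smallest normalized magnitude
DENORMAL_STEP = Fraction(1, 2 ** 149)   # spacing of denormalized numbers


def underflow_flags(exact_result: Fraction) -> dict:
    """Compare underflow flags for a tiny, infinitely precise result."""
    magnitude = abs(exact_result)
    if not (0 < magnitude < MIN_NORMAL):
        return {"ieee": False, "am29c325": False}
    # An IEEE system can deliver the value exactly as a denormalized
    # number when it is a whole multiple of the denormal step.
    ieee_exact = (magnitude / DENORMAL_STEP).denominator == 1
    # The Am29C325 outputs zero instead, so its result is never exact here.
    return {"ieee": not ieee_exact, "am29c325": True}


print(underflow_flags(Fraction(1, 2 ** 127)))
```

For the value 2^-127 only the Am29C325 flag is raised, which is the case described in the paragraph above.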

APPENDIX B

DIFFERENCES BETWEEN DEC VAX AND
Am29C325 DEC MODE

Operation in DEC mode complies with most aspects of
single-precision floating-point operation outlined in the Digital
Equipment Corporation's VAX Architecture Manual. However, there
are some differences that should be noted:

Format 

The Am29C325's DEC format is:

    sign      - bit 31
    exponent  - bits 30-23
    mantissa  - bits 22-0

The VAX format is:

    sign      - bit 15
    exponent  - bits 14-7
    mantissa  - bits 6-0, bits 31-16

In both cases, fields are listed from MSB to LSB, with bit 31 
the MSB of the 32-bit word. The Am29C325's DEC format can 
be converted to VAX format by swapping the 16 LSBs and 16 
MSBs of the 32-bit word. 
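
The half-word swap is simple enough to show directly. A minimal sketch; the function name `swap_halves` is hypothetical.

```python
def swap_halves(word: int) -> int:
    """Swap the 16 LSBs and 16 MSBs of a 32-bit word.

    Converts between the Am29C325's DEC layout and the VAX layout;
    applying it twice returns the original word.
    """
    word &= 0xFFFFFFFF
    return ((word & 0xFFFF) << 16) | (word >> 16)


am29c325_word = 0x40800000
vax_word = swap_halves(am29c325_word)
print(hex(vax_word))
```

Since the operation is its own inverse, the same routine serves for both directions of the conversion.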

Flags vs. Exceptions 

In DEC VAX operation, certain unusual conditions arising 
during system operation may incur an exception, or an 
indication to the operating system that special handling is 
needed. 

The VAX recognizes a number of arithmetic exceptions. The 
following exceptions are relevant to the operations supported 
by the Am29C325: 

Integer Overflow Trap: indicates that the last operation 
produced an integer overflow. The LSBs of the correct result 
are stored in the destination operand. 

Floating-Point Overflow Trap/Fault: indicates that the last
operation produced, after normalization and rounding, a
floating-point number with magnitude greater than or equal to 2^127.
A trap replaces the destination operand with the
DEC-reserved operand 80000000₁₆; a fault leaves the destination
operand unchanged.

Floating-Point Underflow Trap/Fault: indicates that the last
operation produced, after normalization and rounding, a
floating-point number with magnitude less than 2^-128. A trap
replaces the destination operand with zero; a fault leaves the
destination operand unchanged.

Reserved Operand Fault: indicates that the last operation 
had a reserved operand as an input. The destination operand
is unchanged. 

The Am29C325 does not directly support DEC traps and 
faults. Rather, it indicates unusual conditions by setting one or 
more of the six status flags HIGH. Table D2 describes flag 
operation in DEC mode. 

Integer Overflow 

In cases of integer overflow, the VAX signals the integer 
overflow trap and stores the LSBs of the correct result. The 
Am29C325 sets the invalid operation flag and outputs the 
DEC-reserved operand 80000000₁₆.

Floating-Point Underflow/Overflow Operation 

The VAX Architecture Manual specifies the action to be taken
on the destination operand when floating-point underflow or 
overflow is encountered. The Am29C325 has no immediate 
control over this destination operand, as it resides somewhere 
off-chip, either in a register or memory location. This isn't so 
much a difference between the VAX specification and 
Am29C325 operation as it is a difference in scope. 

The Am29C325 responds to floating-point underflow by
producing a final result of 0 (00000000₁₆); the underflow, inexact,
and zero flags will be HIGH. It responds to floating-point
overflow by producing the DEC-reserved operand 80000000₁₆
as the final result; the overflow, inexact, and NAN flags will be
HIGH.

Handling of DEC-Reserved Operands 

If an operation has a DEC-reserved operand as an input, the 
Am29C325 will produce that operand as the final result. If an 
operation has two input arguments and both are DEC- 
reserved operands, the operand on port R becomes the final 
result. For the VAX, operations with a DEC-reserved operand 
input or inputs do not modify the destination operand. As 
mentioned above, control of the destination operand is be- 
yond the scope of the Am29C325's operation. 

Inexact Flag 

The Am29C325 provides an inexact flag to indicate that the 
final result produced by an operation is not equal to the 
infinitely precise result. The VAX does not provide this flag. 

APPENDIX C

PERFORMING FLOATING-POINT DIVISION 
ON THE Am29C325 

While the Am29C325 does not have a floating-point division 
instruction, it can be used to evaluate reciprocals. The 
division: 

C = A/B 
can then be performed by evaluating: 

C = A*(1/B) 

Only a modest amount of external hardware is needed to
implement the reciprocal function.

The technique for calculating reciprocals is based on the
Newton-Raphson method for obtaining the roots of an equation.
The roots of the equation:

    F(x) = 0

can be found by iteratively evaluating the equation:

    x_{i+1} = x_i - F(x_i)/F'(x_i)

The process begins by making a guess as to the value of x_i,
and using this guess or "seed" value to perform the first
iteration. Iterations are continued until the root is evaluated to
the desired accuracy. The number of iterations needed to
achieve a given accuracy depends both on the accuracy of the
seed value and the nature of F(x).

Now consider the equation: 

F(x) = (1/x) - B 

The root of F(x) is 1/B. The reciprocal of B, then, can be found
by using the Newton-Raphson method to find the root of F(x).
The iterative equation for finding the root is:

    x_{i+1} = x_i - F(x_i)/F'(x_i)
            = x_i - (1/x_i - B)/(-1/x_i^2)
            = x_i (2 - B*x_i)

It can be shown that, in order for this iterative equation to
converge, the seed value x_0 must fall in the range:

    0 < x_0 < 2/B      if B > 0
    2/B < x_0 < 0      if B < 0

For example, if the reciprocal of 3 is to be evaluated, the seed
value must be between 0 and 2/3.

The error of x_i reduces quadratically; that is, if the error of x_i is
e, the error is reduced to order e^2 by the next iteration. The
number of bits of accuracy in the result, then, roughly doubles
after every iteration. While this is only an approximation of the
actual error produced, it is a handy rule of thumb for
determining the number of iterations needed to produce a
result of a certain accuracy, given the accuracy of the seed.
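
The iteration and its convergence condition can be tried out in ordinary double-precision arithmetic. The following is a minimal sketch for experimentation, not the Am29C325 procedure itself; the function name `reciprocal_iterates` is hypothetical.

```python
def reciprocal_iterates(b: float, seed: float, iterations: int) -> list:
    """Return [x_0, x_1, ..., x_n] for x_{i+1} = x_i * (2 - b * x_i)."""
    if b == 0:
        raise ValueError("the reciprocal of 0 is undefined")
    # Convergence requires 0 < x_0 < 2/b (b > 0) or 2/b < x_0 < 0 (b < 0).
    low, high = (0.0, 2.0 / b) if b > 0 else (2.0 / b, 0.0)
    if not low < seed < high:
        raise ValueError("seed is outside the convergence range")
    values = [seed]
    for _ in range(iterations):
        x = values[-1]
        values.append(x * (2.0 - b * x))
    return values


for i, x in enumerate(reciprocal_iterates(7.25, 0.1, 4)):
    print(i, x, x - 1.0 / 7.25)   # iterate and its error
```

Printing the error column makes the quadratic convergence visible: each error is roughly B times the square of the previous one.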

Example 1:

Find the reciprocal of 7.25.

Solution:

The seed value must fall in the range:

    0 < x_0 < 2/7.25
or  0 < x_0 < .275862

Suppose x_0 is chosen to be .1:

Iteration 1: x_1 = x_0 (2 - B*x_0)
                 = .1 (2 - (7.25)(.1))
                 = .1275

Iteration 2: x_2 = x_1 (2 - B*x_1)
                 = .1275 (2 - (7.25)(.1275))
                 = .1371421875

Iteration 3: x_3 = x_2 (2 - B*x_2)
                 = .1371421875 (2 - (7.25)(.1371421875))
                 = .1379265230

The actual value of 1/7.25, to ten decimal places, is
.1379310345.

The error after each iteration is:

    Iteration   x_i             Error to Ten Places
    0           0.1             -0.0379310345
    1           0.1275          -0.0104310345
    2           0.1371421875    -0.0007888470
    3           0.1379265230    -0.0000045115

Example 2:

Find the reciprocal of -0.3.

Solution:

The seed value must fall in the range:

    2/(-0.3) < x_0 < 0
or  -6.66 < x_0 < 0

Suppose x_0 is chosen to be -2.0:

Iteration 1: x_1 = x_0 (2 - B*x_0)
                 = -2.0 (2 - (-0.3)(-2.0))
                 = -2.8

Iteration 2: x_2 = x_1 (2 - B*x_1)
                 = -2.8 (2 - (-0.3)(-2.8))
                 = -3.248

Iteration 3: x_3 = x_2 (2 - B*x_2)
                 = -3.248 (2 - (-0.3)(-3.248))
                 = -3.3311488

Iteration 4: x_4 = x_3 (2 - B*x_3)
                 = -3.3311488 (2 - (-0.3)(-3.3311488))
                 = -3.333331902

The actual value of 1/(-0.3), to ten decimal places, is
-3.333333333.

The error after each iteration is:

    i    x_i             Error to Ten Places
    0    -2.0            1.333333333
    1    -2.8            0.533333333
    2    -3.248          0.085333333
    3    -3.3311488      0.002184533
    4    -3.333331902    0.000001431

In order to implement the Newton-Raphson method on the
Am29C325, some means is needed to generate the seed used
in the first iteration. One approach is to place a hardware seed
look-up table between the R bus and the Am29C325; see
Figure C1. A more detailed diagram of the look-up table
appears in Figure C2.

TABLE C1. CONTENTS OF THE SEED EXPONENT PROM

             DEC                           IEEE
    Address (16)   Data (16)      Address (16)   Data (16)
    000            (Note 1)       100            (Note 1)
    001            (Note 1)       101            FC
    002            FF             102            FB
    003            FE             103            FA
    004            FD             104            F9
    005            FC             105            F8
    006            FB             106            F7
    007            FA             107            F6
    008            F9             108            F5
    009            F8             109            F4
    00A            F7             10A            F3
    00B            F6             10B            F2
    00C            F5             10C            F1
    00D            F4             10D            F0
    00E            F3             10E            EF
    00F            F2             10F            EE
    010            F1             110            ED
    011            F0             111            EC
    012            EF             112            EB
    ...            ...            ...            ...
    0EE            13             1EE            0F
    0EF            12             1EF            0E
    0F0            11             1F0            0D
    0F1            10             1F1            0C
    0F2            0F             1F2            0B
    0F3            0E             1F3            0A
    0F4            0D             1F4            09
    0F5            0C             1F5            08
    0F6            0B             1F6            07
    0F7            0A             1F7            06
    0F8            09             1F8            05
    0F9            08             1F9            04
    0FA            07             1FA            03
    0FB            06             1FB            02
    0FC            05             1FC            01
    0FD            04             1FD            (Note 2)
    0FE            03             1FE            (Note 2)
    0FF            02             1FF            (Note 2)

Notes: 1. The reciprocals of these numbers are too large to be represented in the
          selected format.
       2. The reciprocals of these numbers are too small to be represented in
          normalized IEEE format.

[Block diagram: the hardware look-up table is placed between the R bus
and port R of the Am29C325; the S bus drives port S and port F drives
the F bus.]

Figure C1. Adding a Hardware Look-Up Table to the Am29C325



The look-up table has two sections: a biased exponent look-up
PROM, and a fraction look-up PROM. The seed biased
exponent look-up table is stored in a 512-by-8-bit PROM. This
table consists of two sections: the DEC format section (which
occupies addresses 000-0FF₁₆), and the IEEE section
(which occupies addresses 100-1FF₁₆). The appropriate
table will be selected automatically if address line A8 is wired
to the Am29C325's IEEE/DEC pin. The equations
implemented by these table sections are:

    DEC table:   seed biased exponent = 257₁₀ - input biased exponent

    IEEE table:  seed biased exponent = 253₁₀ - input biased exponent

Table C1 lists the contents of this PROM.
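
The two equations are enough to regenerate the PROM image. The sketch below is an illustration under assumptions: the function name `seed_exponent_prom` is hypothetical, address bit 8 is taken to select the IEEE half (addresses 100-1FF₁₆), and the entries marked Note 1 or Note 2 in Table C1 are represented by `None` because the table gives no data value for them.

```python
def seed_exponent_prom() -> list:
    """Build the 512-entry seed exponent table described by Table C1."""
    table = []
    for address in range(512):
        ieee = address >= 0x100            # A8 = 1 selects the IEEE section
        exponent = address & 0xFF          # input biased exponent
        if ieee:
            seed = 253 - exponent
            valid = 1 <= exponent <= 252   # 100 is Note 1, 1FD-1FF are Note 2
        else:
            seed = 257 - exponent
            valid = 2 <= exponent <= 255   # 000 and 001 are Note 1
        table.append(seed if valid else None)
    return table


prom = seed_exponent_prom()
for address in (0x002, 0x0FF, 0x101, 0x1FC):
    print(f"{address:03X}: {prom[address]:02X}")
```

Every valid entry fits in one byte, which is why a 512-by-8-bit device is sufficient.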

The seed fraction look-up table is stored in one or more
PROMs, the number of PROMs depending on the desired
accuracy of the seed value. The hardware depicted in Figure
C2 uses two 4K-by-8-bit PROMs to implement a fraction
look-up table whose inputs are the 12 MSBs of the input
argument's fraction. These PROMs output the 16 MSBs of the
seed's fraction field; the remaining 7 bits of fraction are set
to 0. The equation implemented in this table is:

    seed fraction = 2/(1 + input fraction) - 1

where the value of the input fraction falls in the range

    0 <= input fraction < 1

Note that the seed fraction must also be constrained to fall in
the range

    0 <= seed fraction < 1

Therefore, if the input fraction is 0, the corresponding seed
fraction stored in the table must be .111...111₂, not 1.0₂. The
same seed fraction look-up table may be used for both IEEE
and DEC formats. Table C2 contains a partial listing for the
seed fraction look-up table shown in Figure C2.
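
A single table entry can be computed with integer arithmetic. This is a sketch under assumptions, not the procedure used to program the actual PROMs: the function name `seed_fraction_entry` is hypothetical, the 16-bit output is assumed to be rounded to the nearest value (the text does not say whether entries are rounded or truncated, so individual entries may differ by one LSB), and the entry for an input fraction of 0 is clamped to all ones as the text requires.

```python
def seed_fraction_entry(address: int) -> int:
    """Return the 16-bit seed fraction for a 12-bit fraction address.

    With input fraction f = address / 4096:
        seed = 2 / (1 + f) - 1 = (4096 - address) / (4096 + address)
    The result is scaled by 2**16 and rounded to nearest (assumption).
    """
    if not 0 <= address < 4096:
        raise ValueError("address must be a 12-bit value")
    numerator = (4096 - address) * 65536
    denominator = 4096 + address
    value = (2 * numerator + denominator) // (2 * denominator)
    return min(value, 0xFFFF)   # input fraction 0 must yield .111...1, not 1.0


for address in (0x000, 0x001, 0x008, 0xFF6, 0xFFF):
    print(f"{address:03X}: {seed_fraction_entry(address):04X}")
```

The upper byte of each result corresponds to the R22-R15 PROM and the lower byte to the R14-R7 PROM.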

TABLE C2. CONTENTS OF THE SEED FRACTION PROMS

                                                                       PROM Outputs (16)
    Address (16)   Value of Input Fraction (10)   Value of Seed Fraction (10)   R22-R15   R14-R7
    000            0.0                            0.9999999999 (see text)       FF        FF
    001            0.0002441406                   0.9995118370                  FF        E0
    002            0.0004882812                   0.9990239150                  FF        C0
    003            0.0007324219                   0.9985362280                  FF        A0
    004            0.0009765625                   0.9980487790                  FF        80
    005            0.0012207031                   0.9975615710                  FF        60
    006            0.0014648438                   0.9970745970                  FF        40
    007            0.0017089844                   0.9965878630                  FF        20
    008            0.0019531250                   0.9961013650                  FF        00
    009            0.0021972656                   0.9956151030                  FE        E1
    00A            0.0024414063                   0.9951290800                  FE        C0
    00B            0.0026855469                   0.9946432920                  FE        A1
    00C            0.0029296875                   0.9941577400                  FE        81
    ...            ...                            ...                           ...       ...
    FF6            0.9975585938                   0.0012221950                  00        50
    FF7            0.9978027344                   0.0010998410                  00        48
    FF8            0.9980468750                   0.0009775170                  00        40
    FF9            0.9982910156                   0.0008552230                  00        38
    FFA            0.9985351563                   0.0007329590                  00        30
    FFB            0.9987792969                   0.0006107240                  00        28
    FFC            0.9990234375                   0.0004885200                  00        20
    FFD            0.9992675781                   0.0003663450                  00        18
    FFE            0.9995117188                   0.0002442000                  00        10
    FFF            0.9997558594                   0.0001220850                  00        08


[Block diagram: the sign (R31), the biased exponent (R30-R23), and the
12 MSBs of the fraction (R22-R11) are taken from the R bus. The biased
exponent addresses the 512 x 8 seed exponent PROM (Am27S15); the 12
fraction MSBs address the two 4K x 8 seed fraction PROMs (Am27S43).
The PROM outputs D7-D0 supply the seed exponent and the seed fraction.]

Figure C2. The Hardware Look-Up Table


With the hardware look-up table in place, the reciprocal of
value B can be calculated with the following series of
operations:

1) Place B on both the R and S buses. The 2:1 multiplexer at
the output of the hardware look-up table should select the
output of the look-up table (see Figure C3-A).

2) Load the seed value x_0 into register R and load B into
register S. Select the R TIMES S operation (see Figure
C3-B).

3) Load product B*x_0 into register F. Select the 2 MINUS S
operation, and select register F as the input to the ALU S
port (see Figure C3-C).

4) Load 2 - B*x_0 into register F. Select the R TIMES S
operation and select register F as the input to the ALU S
port (see Figure C3-D).

5) Load the value x_1 (x_1 = x_0(2 - B*x_0)) into registers R and F.
Select the R TIMES S operation (see Figure C3-E).

6) Repeat steps 3 through 5 until the result has the accuracy
desired.
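
The arithmetic performed by the reciprocal procedure above can be imitated in software by rounding every intermediate result to IEEE single precision. The sketch below is an approximation for checking values, not a model of the device: it assumes round-to-nearest mode, uses Python's `struct` module to round to 32 bits, and takes the seed from the worked IEEE example for 25.3 (B = 41CA6666₁₆, seed = 3D21E800₁₆) rather than from a look-up table. All function names are hypothetical.

```python
import struct


def f32(x: float) -> float:
    """Round a Python float to IEEE single precision."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def bits_to_f32(bits: int) -> float:
    """Interpret a 32-bit word as an IEEE single."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def f32_to_bits(x: float) -> int:
    """Return the 32-bit word of a value rounded to IEEE single."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def reciprocal_iteration(b: float, x: float) -> float:
    """One pass of the multiply / 2-minus / multiply sequence."""
    f = f32(x * b)       # R TIMES S:  F = B * x
    f = f32(2.0 - f)     # 2 MINUS S:  F = 2 - B * x
    return f32(x * f)    # R TIMES S:  F = x * (2 - B * x)


b = bits_to_f32(0x41CA6666)       # 25.3
x = bits_to_f32(0x3D21E800)       # seed value
for iteration in (1, 2):
    x = reciprocal_iteration(b, x)
    print(iteration, f"{f32_to_bits(x):08X}", x)
```

Because the product of two single-precision values is exact in double precision, each step incurs a single rounding, so the sketch tracks the three-operation sequence closely.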

[Data-flow diagrams: each figure shows the S bus, the R bus with the
seed look-up table and 2:1 multiplexer, registers R, S, and F, the ALU
ports R, S, and F, and the F bus, with the active data path for the
step highlighted.]

Figure C3-A. Data Flow for Step 1 of the Reciprocal Procedure

Figure C3-B. Data Flow for Step 2 of the Reciprocal Procedure

Figure C3-C. Data Flow for Step 3 of the Reciprocal Procedure

Figure C3-D. Data Flow for Step 4 of the Reciprocal Procedure

Figure C3-E. Data Flow for Step 5 of the Reciprocal Procedure

A tabular description of the operations above is given in Table
C3. The following examples, performed in IEEE format,
illustrate the process.

Example 1:

Find the reciprocal of 25.3.

Solution: The IEEE floating-point representation for 25.3 is
41CA6666₁₆. The reciprocal process is begun by
feeding this value to both the seed look-up table
and port S. The look-up table produces the value
.03952789₁₀ (3D21E800₁₆). The reciprocal is
evaluated using the procedure described above;
register values for each step are given in Table C4.
The expected result, to the precision of the
floating-point word, is .03952569₁₀ (3D21E5B1₁₆). In
this case the expected result is produced after the
first iteration. All subsequent iterations produce the
same result, and are therefore unnecessary.



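TABLE C3. SEQUENCE OF EVENTS FOR EVALUATING RECIPROCALS

Columns: Clock Cycle, I0-I2, ENR, ENS, ENF, Register R, Register S,
Register F. Entries for successive clock cycles:

    I0-I2:       R TIMES S, 2 MINUS S, R TIMES S, R TIMES S,
                 2 MINUS S, R TIMES S, R TIMES S
    Register R:  x_0, x_0, x_0, x_1 (= x_0(2 - B*x_0)), x_1, x_1,
                 x_2 (= x_1(2 - B*x_1))
    Register S:  DON'T CARE
    Register F:  B*x_0, 2 - B*x_0, x_1 (= x_0(2 - B*x_0)), B*x_1,
                 2 - B*x_1, x_2 (= x_1(2 - B*x_1))

The value x_1 in register F completes the first iteration; the value
x_2 completes the second iteration.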



TABLE C4. INPUT BUS AND REGISTER VALUES FOR EXAMPLE 1

    Clock
    Cycle   R Input                  S Input             Register R               Register S          Register F
    1       3D21E800₁₆ (.03952789)   41CA6666₁₆ (25.3)   -                        -                   -
    2       -                        -                   3D21E800₁₆ (.03952789)   41CA6666₁₆ (25.3)   -
    3       -                        -                   3D21E800₁₆ (.03952789)   41CA6666₁₆ (25.3)   3F8001D3₁₆ (1.0000556)
    4       -                        -                   3D21E800₁₆ (.03952789)   41CA6666₁₆ (25.3)   3F7FFC5A₁₆ (.99994433)
    5       -                        -                   3D21E5B1₁₆ (.03952569)   41CA6666₁₆ (25.3)   3D21E5B1₁₆ (.03952569)   Result of first iteration
    6       -                        -                   3D21E5B1₁₆ (.03952569)   41CA6666₁₆ (25.3)   3F7FFFFF₁₆ (.99999994)
    7       -                        -                   3D21E5B1₁₆ (.03952569)   41CA6666₁₆ (25.3)   3F800000₁₆ (1.0)
    8       -                        -                   3D21E5B1₁₆ (.03952569)   41CA6666₁₆ (25.3)   3D21E5B1₁₆ (.03952569)   Result of second iteration

Example 2:

Find the reciprocal of -0.4725.

Solution: The IEEE floating-point representation for -0.4725
is BEF1EB85₁₆. The reciprocal process is begun
by feeding this value to both the seed look-up table
and port S. The look-up table produces the value
-2.11621094₁₀ (C0077000₁₆). The reciprocal is
evaluated using the procedure described above;
register values for each step are given in Table C5.
The expected result, to the precision of the
floating-point word, is -2.116402₁₀ (C0077322₁₆). In
this case the expected result is produced after the
first iteration. All subsequent iterations produce the
same result, and are therefore unnecessary.



TABLE C5. INPUT BUS AND REGISTER VALUES FOR EXAMPLE 2

    Clock
    Cycle   R Input                   S Input                 Register R                Register S              Register F
    1       C0077000₁₆ (-2.1162109)   BEF1EB85₁₆ (-0.4725)    -                         -                       -
    2       -                         -                       C0077000₁₆ (-2.1162109)   BEF1EB85₁₆ (-0.4725)    -
    3       -                         -                       C0077000₁₆ (-2.1162109)   BEF1EB85₁₆ (-0.4725)    3F7FFA14₁₆ (0.99990963)
    4       -                         -                       C0077000₁₆ (-2.1162109)   BEF1EB85₁₆ (-0.4725)    3F8002F6₁₆ (1.0000904)
    5       -                         -                       C0077322₁₆ (-2.116402)    BEF1EB85₁₆ (-0.4725)    C0077322₁₆ (-2.116402)   Result of first iteration
    6       -                         -                       C0077322₁₆ (-2.116402)    BEF1EB85₁₆ (-0.4725)    3F800000₁₆ (1.0)
    7       -                         -                       C0077322₁₆ (-2.116402)    BEF1EB85₁₆ (-0.4725)    3F800000₁₆ (1.0)
    8       -                         -                       C0077322₁₆ (-2.116402)    BEF1EB85₁₆ (-0.4725)    C0077322₁₆ (-2.116402)   Result of second iteration

APPENDIX D

SUMMARY OF FLAG OPERATION

Tables D1, D2, and D3 summarize flag operation for the IEEE
mode, the DEC mode, and for the IEEE-TO-DEC and
DEC-TO-IEEE operations.





TABLE D1. FLAG SUMMARY FOR IEEE MODE

    Operation                          Condition(s)                     INV  OVF  UNF  INE  ZER  NAN

    Any operation listed in the                                         H    L    L    L    L    H
    IEEE Invalid Operations Table

    R PLUS S, R MINUS S,               Input operands are finite,       L    H    L    H    L    L
    R TIMES S, 2 MINUS S               |rounded result| >= 2^128

    R PLUS S, R MINUS S,               0 < |rounded result| < 2^-126    L    L    H    H    H    L
    R TIMES S

    R PLUS S, R MINUS S,               Final result does not equal      L    *    *    H    *    L
    R TIMES S, 2 MINUS S,              infinitely precise result
    INT-TO-FP, FP-TO-INT

    R PLUS S, R MINUS S,               Final result is zero             L    L    *    *    H    L
    R TIMES S, 2 MINUS S,
    INT-TO-FP, FP-TO-INT

    R PLUS S, R MINUS S,               Final result is a NAN            *    L    L    L    L    H
    R TIMES S, 2 MINUS S,
    FP-TO-INT

Notes: INV = Invalid operation flag
       OVF = Overflow flag
       UNF = Underflow flag
       INE = Inexact flag
       ZER = Zero flag
       NAN = NAN flag
       L = LOW
       H = HIGH
       * = State of flag depends on the input operands
           and the operation performed

TABLE D2. FLAG SUMMARY FOR DEC MODE

    Operation                          Condition(s)                     INV  OVF  UNF  INE  ZER  NAN

    FP-TO-INT                          Rounded result > 2^31 - 1        H    L    L    L    L    H
                                       or rounded result < -2^31

    FP-TO-INT                          Input is a DEC-reserved          L    L    L    L    L    H
                                       operand

    R PLUS S, R MINUS S,               |Rounded result| >= 2^127        L    H    L    H    L    H
    R TIMES S, 2 MINUS S

    R PLUS S, R MINUS S,               0 < |rounded result| < 2^-128    L    L    H    H    H    L
    R TIMES S

    R PLUS S, R MINUS S,               Final result does not equal      L    *    *    H    *    *
    R TIMES S, 2 MINUS S,              infinitely precise result
    INT-TO-FP, FP-TO-INT

    R PLUS S, R MINUS S,               Final result is zero             L    L    *    *    H    L
    R TIMES S, 2 MINUS S,
    INT-TO-FP, FP-TO-INT

    R PLUS S, R MINUS S,               Final result is a                *    *    L    L    L    H
    R TIMES S, 2 MINUS S,              DEC-reserved operand
    FP-TO-INT

Notes: INV = Invalid operation flag
       OVF = Overflow flag
       UNF = Underflow flag
       INE = Inexact flag
       ZER = Zero flag
       NAN = NAN flag
       L = LOW
       H = HIGH
       * = State of flag depends on the input operands
           and the operation performed


TABLE D3. FLAG SUMMARY FOR IEEE-TO-DEC AND DEC-TO-IEEE CONVERSIONS

    Operation                          Condition(s)                     INV  OVF  UNF  INE  ZER  NAN

    IEEE-TO-DEC                        Input is a NAN                   H    L    L    L    L    H

    IEEE-TO-DEC                        |Input| >= 2^127                 L    H    L    H    L    H

    DEC-TO-IEEE                        Input is a DEC-reserved          H    L    L    L    L    H
                                       operand

    DEC-TO-IEEE                        0 < |rounded result| < 2^-126    L    L    H    H    H    L

    DEC-TO-IEEE,                       Final result is zero             L    L    *    *    H    L
    IEEE-TO-DEC

Notes: INV = Invalid operation flag
       OVF = Overflow flag
       UNF = Underflow flag
       INE = Inexact flag
       ZER = Zero flag
       NAN = NAN flag
       L = LOW
       H = HIGH
       * = State of flag depends on the input operands
           and the operation performed

ABSOLUTE MAXIMUM RATINGS

    Storage Temperature ....................... -65 to +150°C
    Case Temperature Under Bias ............... -55 to +125°C
    Supply Voltage to Ground Potential
      Continuous .............................. -0.3 to +7.0 V
    DC Voltage Applied to Outputs
      for HIGH Output State ................... -0.3 V to +VCC + 0.3 V
    DC Input Voltage .......................... -0.3 to +VCC + 0.3 V
    DC Output Current, into LOW Outputs ....... 30 mA
    DC Input Current .......................... -10 to +10 mA

Stresses above those listed under ABSOLUTE MAXIMUM
RATINGS may cause permanent device failure. Functionality
at or above these limits is not implied. Exposure to absolute
maximum ratings for extended periods may affect device
reliability.

OPERATING RANGES

    Commercial (C) Devices
      Temperature, Case (TA) .................. 0 to +70°C
      Supply Voltage (VCC) .................... +4.75 to +5.25 V
    Military* (M) Devices
      Temperature (TA) ........................ -55 to +125°C
      Supply Voltage (VCC)

Operating ranges define those limits between which the
functionality of the device is guaranteed.

*Military product 100% tested at TA = +25°C, +125°C, and
-55°C.

DC CHARACTERISTICS over operating range unless otherwise specified (for APL Products, Group A,
Subgroups 1, 2, 3 are tested unless otherwise noted)


Parameter  Parameter                       Test Conditions (Note 1)            Min.   Max.   Unit 
Symbol     Description 

VOH        Output HIGH Voltage             VCC = Min., VIN = VIL or VIH,       2.4           V 
                                           IOH = -0.4 mA 
VOL        Output LOW Voltage              VCC = Min., IOL = 8 mA for                 0.5    V 
                                           Y-BUS, 4 mA for All Other Pins 
VIH        Guaranteed Input Logical                                            2.0           V 
           HIGH Voltage (Note 2) 
VIL        Guaranteed Input Logical                                                   0.8    V 
           LOW Voltage (Note 2) 
IIL        Input LOW Current               VCC = Max., VIN = 0.5 V                    -10    µA 
IIH        Input HIGH Current              VCC = Max., VIN = VCC - 0.5 V              10     µA 
IOZH       Off-State (HIGH Impedance)      VCC = Max., VO = 2.4 V                     10     mA 
           Output Current 
IOZL       Off-State (HIGH Impedance)      VCC = Max., VO = 0.5 V                     -10    mA 
           Output Current 
ICC        Static Power Supply Current     VCC = Max., VIN = VCC or GND,       ICC = 30 mA (COM and MIL) 
                                           IO = 0 µA 
CPD        Power Dissipation Capacitance   VCC = 5.0 V, TA = 25°C, No Load     (value not legible) pF Typical 
           (Note 3) 


Notes: 1. VCC conditions shown as Min. or Max. refer to the commercial and military VCC limits. 

2. These input levels provide zero-noise immunity and should only be statically tested in a noise-free environment (not functionally tested). 

3. CPD determines the no-load dynamic current consumption: 

ICC (Total) = ICC (Static) + CPD × VCC × f, where f is the switching frequency of the majority of the internal nodes, normally one-half of 
the clock frequency. 
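
As a rough illustration of Note 3, the relation can be evaluated numerically. The sketch below is only an example under stated assumptions: the function name is hypothetical, the CPD value of 1000 pF is a made-up placeholder (the typical CPD figure is not legible in the table above), and the 30 mA static current is the ICC figure listed in the table.

```python
def total_supply_current_ma(icc_static_ma, cpd_pf, vcc_v, clock_hz):
    """ICC(Total) = ICC(Static) + CPD * VCC * f, with f taken as clock/2."""
    f_switch_hz = clock_hz / 2.0                  # "normally one-half of the clock frequency"
    dynamic_a = cpd_pf * 1e-12 * vcc_v * f_switch_hz
    return icc_static_ma + dynamic_a * 1e3        # convert amperes to milliamperes


# Hypothetical inputs: CPD = 1000 pF is an assumed placeholder, not a datasheet value.
estimate = total_supply_current_ma(icc_static_ma=30.0, cpd_pf=1000.0,
                                   vcc_v=5.0, clock_hz=10e6)
print(f"Estimated no-load ICC: {estimate:.1f} mA")
```

With a real CPD value substituted, the same function gives a first-order estimate only; output loading is not included.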

SWITCHING CHARACTERISTICS over COMMERCIAL operating range 


No. 


Parameter 
Symbol 


Parameter 
Description 


Test 
Conditions 


29C325 


29C325-1 


29C325-2 


Unit 


Min. 


Max. 


Min. 


Max. 


Min. 


Max. 


1 


tASC 


Clocked Add, Subtract Time (R 
PLUS S, R MINUS S, 2 MINUS S) 






130 




98 




78 


ns 


2 


tMC 


Clocked Multiply Time (R TIMES S) 




130 




98 




78 


ns 


3 


tCC 


Clocked Conversion Time (INT-TO-FP, 
FP-TO-INT, IEEE-TO-DEC, DEC-TO-IEEE) 




130 




98 




78 


ns 


4 


tASUC 


Unclocked Add, Subtract Time (R, S 
to F, Flags) for R PLUS S, R 
MINUS S, and 2 MINUS S 
Instructions 


FT0 = HIGH 
FT1 = HIGH 




145 




126 




100 


ns 


5 


tMUC 


Unclocked Multiply Time (R, S to F, 
Flags) for R TIMES S Instruction 




145 


'< 






100 


ns 


6 


tCUC 


Unclocked Conversion Time (R, S to 
F, Flags) for INT-TO-FP, FP-TO-INT, 
IEEE-TO-DEC and DEC-TO-IEEE 
Instructions 




145 




■f'^ 




100 


ns 


7 


tPWH 


Clock Pulse Width HIGH 




20 


4 


\ 




15 




ns 


8 


tPWL 


Clock Pulse Width LOW 


20 


,#'» 


PSu15 




15 




ns 


9 


tpDOFI 


Clock to F0-F31 and Flag Outputs 


FTo " LOW 
FT, = HIGH 




«P*i 


tp' 


118 




94 


ns 


10 


tPD0F2 


FT, = LOW 


4 


. w 




20 




16 


ns 


11 


'PZL 


OE Enable Time 


Z to LOW 




■^ 


:%«3* 




20 




16 


ns 


12 


tpZH 


Z to HIGH, 


.'«.» 


tl# 




20 




16 


ns 


13 


tPLZ 


OE Disable Time 


LOW to Z 




- '% 


|»'"23 




20 




16 


ns 


14 


tPHZ 


HIGH to Z 


a** 


23 




20 




16 


ns 


15 


<PZL16 


Qock t to F0-F15 
Enable, 16-Bit I/O 
Mode 


Z to LOW 


S16/32 = HIGH 
ONEBUS = LOW 


,s 


27 




22 




18 


ns 


16 


tpzHie 


Z to HIGH 




27 




22 




18 


ns 


'17 


tPLZ16 


Clock t to F0-F15 


LOW to Z 


' 


Fi- 


29 




22 




18 


ns 


18 


tPHZ16 


Mode 


HIGH TO Z 




29 




22 




18 


ns 


19 


IPZLie 


Clock ) to F16-F31 
Enable, 16-Bit I/O 
Mode 


Z to LOW 


'""^ 




30 




22 




18 


ns 


20 


tpzHie 


Z to HIGH 




30 




22 




18 


ns 


21 


tPLZ16 


Clock t to F16-F31 
Disable.1 6-Bit I/O 
Mode 


LOW to Z 




25 




21 




17 


ns 


22 


tPHZ16 


HIGH to Z 

.,1' .J 




26 




21 




17 


ns 


23 


ISCE 


Register Clock Enable Setup Tl^pJ' 


F% - LOW 
Ft, -LOW 


15 




15 




15 




ns 


24 


tHCE 


Register Clock Enable Hol#time.a, 


FTo = LOW 
FTi - LOW 












D 




ns 


25 


<SD1 


Ro-Rsi. S0-S31 Setup T]rae'(rWfe 

1) ■■■:■••::. 10-;- 


FTo - LOW 


15 




15 




15 




ns 


26 


tHD1 


Ro-R31, So-S3i Hold Time (Note 1) 

















ns 


27 


tSD2 


R0-R31. So - Sst Setup TiWe (Note 

1) , *<;..»'■"■ 


FTo = HIGH 
FTi - LOW 


136 




118 




118 




ns 


28 


'HD2 


R0-R31, So-%1 Ho|.Time (Note 1) 

















ns 


29 


tSI02 


Iq - 12 InsMgg^^elect Setup Time 


FT for 

Destination 
Register = LOW 


136 




118 




118 




ns 


30 


tHI02 


lo - 12 f«ippn Select Hold Time 

















ns 


31 


tPDI02 


lo - il'-ill^rucBn Select to F0-F31, 


FT, - HIGH 




136 




lie 




118 


ns 


32 


tSI3 


r^SSp6|lnput Select Setup Time 


FTi - LOW 


136 




118 


t 


118 




ns 


33 


tHI3 


I3 P^ S Input Select Hold Time 

















ns 


34 


tSI4 


I4 Register R Input Select Setup 
Time (Note 1) 


FTo - LOW 


15 




15 




15 




ns 


35 


tHI4 


I4 Register R Input Select Hold 

Time 

(Note 1) 

















ns 


36 


tSRM 


Round Mode Select Setup Time 


FT for 

Destination 
Register = LOW 


50 




46 




46 




ns 


37 


<HBM 


Round Mode Select Hold Time 

















ns 


38 


tPRF 


Round Mode Select to F0-F31, Flags 


FT, = HIGH 




64 




58 




58 


ns 


Notes: 1. See timing diagram for desired mode of operation to determine clock edge to which these setup and hold times apply. 

SWITCHING CHARACTERISTICS over MILITARY operating range (for APL Products, Group A, Subgroups 
9, 10, 11 are tested unless otherwise noted) 


No. 


Parameter 
Symbol 


Parameter 
Description 


Test 
Conditions 


29C325 


Unit 


Min. 


Max. 


1 


tASC 


Clocked Add, Subtract Time (R PLUS S, 
R MINUS S, 2 MINUS S) 






145 


ns 


2 


tMC 


Clocked Multiply Time (R TIMES S) 




145 


ns 


3 


tCC 


Clocked Conversion Time (INT-TO-FP, 
FP-TO-INT, IEEE-TO-DEC, DEC-TO-IEEE) 




145 


ns 


4 


tASUC 


Unclocked Add, Subtract Time (R, S to F, 
Flags) for R PLUS S, R MINUS S, 
and 2 MINUS S Instructions 






160 


ns 


5 


tMUC 


Unclocked Multiply Time (R, S to F, Flags) 
for R TIMES S Instruction 


FT0 = HIGH 
FT1 = HIGH 

.,f -Is 


'1^- 


160 


ns 


6 


tCUC 


Unclocked Conversion Time (R, S to F, 
Flags) for INT-TO-FP, FP-TO-INT, 
IEEE-TO-DEC and DEC-TO-IEEE Instructions 


1- 


160 


ns 


7 


tpWH 


Clock Pulse Width HIGH 


%,- 


20 




ns 


8 


tpWL 


Clock Pulse Width LOW 


20 




ns 


9 


tpDOFI 


Clock to F0-F31 and Flag Outputs 


^?:^ 




152 


ns 


10 


tpD0F2 


ftS»*' 




30 


ns 


11 


IPZL 


OE Enable Time 


Z to LOW 


^:'-% 




26 


ns 


12 


tpZH 


Z to HIGH < 


*#■** 




26 


ns 


13 


tpLZ 


OE Disable Time 


LOW to Z JSS 


ff 




26 


ns 


14 


tPHZ 


HIGH to Z g 




26 


ns 


15 


tpzLie 


Clock t to F0-F15 Enable, 16- 
Bit I/O Mode 


z to Logf ,«5a 


bil6/32-HIGH 
ONEBUS - LOW 




30 


ns 


16 


tpzHie 


z to Hf%r 




30 


ns 


17 


tPLZie 


Clock Mo F0-F15 Disable, 
16-Bit I/O Mode 


LOW*i%i,''-% 






33 


ns 


18 


tPHZie 


nm9&m 




33 


ns 


19 


'PZL16 


Clock i to F16-F31 Enable, 
16-Bit I/O Mode 


'm^m 


S16/32 = HIGH 
ONEBUS - LOW 




34 


ns 


20 


tpzHie 


z to fen 




34 


ns 


21 


tpLzie 


Clock f to F16-F31 0f , 
Disable, 16-Bit I/O Mode '^i«A 


ti-Qil to z 




28 


ns 


22- 


tpHzie 


.^H to Z 




28 


ns 


23 


tSOE 


Register Clock Enable ^tup T,i^e 


FTo = LOW 
FTi - LOW 


16 




ns 


24 


tHCE 


Register Clock Enable ftajil'fime 

iff'-.**-* 


FTo = LOW 
FTi = LOW 







ns 


25 


ISDI 


R0-R31. Sq-^ji i^. Time (Note 1) 


FTo = LOW 


15 




ns 


26 


tHDI 


Fio-Fi31. So-^fftlfl Time (Note 1) 







ns 


27 


<SD2 


Ro-F!3i. %-S3pSetup Time (Note 1) 


FTo = HIGH 
FTi = LOW 


152 




ns 


28 


tHD2 


Ro-Ri^^iiS3l'Hold Time (Note 1) 


-30 




ns 


29 


tSI02 


lo-^^i^ructiSn Select Setup Time 


FT for Destination 
Register - LOW 


152 




ns 


30 


tHIOS 


lo#i lnsi|ction Select Hold Time 







ns 


31 


tPDI02 


Jp-r2*i^ction Select to F0-F31, Flags 


FT, = HIGH 




152 


ns 


32 


tSI3 


V*S*S Input Select Setup Time 


FTi - LOW 


152 




ns 


33 


'HI3 


iJ'Rort S Input Select Hold Time 







ns 


34 


tSI4 


I4 Register R Input Select Setup Time (Note 1) 


FTo = LOW 


15 




ns 


35 


tHI4 


I4 Register R Input Select Hold Time (Note 1) 







ns 


36 


tSRM 


Round Mode Select Setup Time 


FT for Destination 
Register = LOW 


65 




ns 


37 


tHRM 


Round Mode Select Hold Time 







ns 


38 


tpRF 


Round Mode Select to F0-F31, Flags 


FTi - HIGH 




80 


ns 


Notes: 1. See timing diagram for desired mode of operation to determine clock edge to which these setup and hold times apply. 

SWITCHING TEST CIRCUITS 

[Figure TC001084: switching test load circuits. A. Three-State Outputs; B. Normal Outputs. Load resistors R1 and R2 and load capacitance CL are defined in the figure.] 

Notes: 1. CL = 50 pF includes scope probe, wiring, and stray capacitances without device in test fixture. 

2. S1, S2, S3 are closed during function tests and all AC tests except output enable tests. 

3. S1 and S3 are closed while S2 is open for tPZH test. 
S1 and S2 are closed while S3 is open for tPZL test. 

4. CL = 5.0 pF for output disable tests. 

SWITCHING TEST WAVEFORMS 

[Figure WFR02970: Set-Up, Hold, and Release Times. Data input and timing input waveforms with 3 V and 1.5 V reference levels.] 

Notes: 1. Diagram shown for HIGH data only. 
Output transition may be opposite sense. 
2. Cross hatched area is don't care condition. 

[Figure: Propagation Delay.] 

[Figure: Pulse Width. LOW-HIGH-LOW pulse and HIGH-LOW-HIGH pulse.] 

[Figure WFR02660: Enable and Disable Times.] 

Notes: 1. Diagram shown for Input Control Enable-LOW and Input Control Disable-HIGH. 
2. S1, S2 and S3 of Load Circuit are closed except where shown. 

SWITCHING WAVEFORMS 
KEY TO SWITCHING WAVEFORMS 

[Figure: key to switching waveforms, relating each waveform symbol to its meaning for inputs and outputs. Legible entries: "will be changing from H to L"; "don't care, any change permitted" (inputs) and "changing, state unknown" (outputs); "does not apply" (inputs) and "center line is high-impedance 'off' state" (outputs).] 

[Figure: Clocked Operation, FT0 = LOW, FT1 = LOW.] 

SWITCHING WAVEFORMS (Cont'd.) 

[Figure WF023770: Clocked Operation, FT0 = HIGH, FT1 = LOW. Signals shown include CLK, S0-S31, I0-I2, I3, and RND0-RND1.] 

[Figure WF023780: Clocked Operation, FT0 = LOW, FT1 = HIGH. Signals shown include CLK, I0-I2, and RND0-RND1.] 

SWITCHING WAVEFORMS (Cont'd.) 

[Figure: Flow-Through Operation (FT0 = HIGH, FT1 = HIGH). Signals shown include RND0-RND1 and FLAGS.] 

[Figure: 32-Bit, Single-Input Bus Mode. Signals shown include CLK, INPUT DATA BUS, S DATA, and R DATA.] 

SWITCHING WAVEFORMS (Cont'd.) 

[Figure WF023810: 16-Bit, Two-Input Bus Mode. Signals shown include F0-F15.] 

Note 1. I4 has special setup and hold time requirements in this mode. All other control signals have timing 
requirements as shown in the diagram "Clocked Operation, FT0 = LOW, FT1 = LOW." 

INPUT/OUTPUT CIRCUIT DIAGRAMS 

[Figure: input/output circuit diagrams, showing the driven input stage and the output stage.] 



Am29C327 

CMOS Double-Precision Floating-Point Processor 

ADVANCE INFORMATION 

DISTINCTIVE CHARACTERISTICS 

• High-performance double-precision floating-point processor 

• Comprehensive floating-point and integer instruction sets 

• Single VLSI device performs single-, double-, and 
mixed-precision operations 

• Performs conversions between precisions and between 
data formats 

• Compatible with industry-standard floating-point formats 
  - IEEE 754 format 
  - DEC F, DEC D, and DEC G formats 
  - IBM System/370 format 

• Exact IEEE compliance for denormalized numbers with 
no speed penalty 

• Eight-deep register file for intermediate results and on-chip 
64-bit data path facilitates compound operations; 
e.g., Newton-Raphson division, sum-of-products, and 
transcendentals 

• Supports pipelined or flow-through operation 

• Fabricated with Advanced Micro Devices' 1.2 micron 
CMOS process 

SIMPLIFIED SYSTEM DIAGRAM 

[Figure BD007470: simplified system diagram. Blocks shown: R-Port and S-Port, Operand Router, R-Register, S-Register, Register File, Constants, ALU Input Multiplexer, Floating-Point & Integer ALU (64-bit), F-Register, Output Multiplexer, and the 32-bit F-Port.] 

DEC F, DEC D, DEC G, and VAX are trademarks of the Digital Equipment Corporation. 
IBM System/370 is a trademark of International Business Machines, Inc. 

Publication # 09418   Rev. B   Amendment /0 
Issue Date: November 1987 



GENERAL DESCRIPTION 

The Am29C327 double-precision floating-point processor is a 
single VLSI device that implements an extensive floating-point 
and integer instruction set, and can perform single-, double-, or 
mixed-precision operations. The three most popular floating-point 
formats (IEEE, DEC, and IBM) are supported. IEEE 
operations comply with Standard 754, with direct implementation 
of special features such as gradual underflow and trap 
handling. 

The Am29C327 consists of a 64-bit ALU, a 64-bit data path, 
and a control unit. The ALU has three data input ports, and 
can perform compound operations of the form (A * B) + C. 
The data path comprises two 64-bit input operand registers, 
an 8-by-64-bit register file for storage of intermediate results, 
three operand-selection multiplexers that provide for orthogonal 
selection of input operands, a 64-bit output register, and an 
output multiplexer that allows access to the 32 MSBs or 32 
LSBs of the result data. Control signals determine the operation 
to be performed, the source of operands, operand 
precision, rounding mode, and other aspects of device operation. 

Operations can be performed in either of two modes: flow-through 
or pipelined. In the flow-through mode, the ALU is 
completely combinatorial; this mode is best suited for scalar 
operations. Pipelined mode divides the ALU into one or two 
pipelined stages, for use in vector operations, as often found 
in graphics or signal processing. 

Fabricated with AMD's 1.2 micron technology, the Am29C327 
is housed in a 169-lead pin-grid-array (PGA) package. 

This document contains information on a product under development at Advanced Micro Devices, Inc. The information is intended to 
help you to evaluate this product. AMD reserves the right to change or discontinue work on this proposed product without notice. 

RELATED AMD PRODUCTS 

Part No.      Description 

Am29C10A      CMOS Microprogram Controller 
Am29C116      CMOS Minimum Power 16-Bit Microprocessor 
Am29C117      CMOS Two-Port 16-Bit Microprocessor 
Am29PL141     Field-Programmable Controller (FPC) 
Am29C323      CMOS 32-Bit Parallel Multiplier 
Am29C325      CMOS 32-Bit Floating-Point Processor 
Am29C331      CMOS 16-Bit Microprogram Sequencer 
Am29C332      CMOS 32-Bit Arithmetic Logic Unit 
Am29C334      CMOS Four-Port Dual-Access Register File 


CONNECTION DIAGRAM 

169-Lead PGA* 

Bottom View 

[Figure CD009761: 169-lead pin-grid-array pinout. Columns are lettered A B C D E F G H J K L M N P R T U; rows are numbered 1 through 17.] 

*Pinout observed from pin side of package. 
**Alignment pin (not connected internally). 

PIN DESIGNATIONS 
(Sorted by Pin No.) 

PIN NO.  PIN NAME   PIN NO.  PIN NAME   PIN NO.  PIN NAME   PIN NO.  PIN NAME 

A-1                 C-9                 J-15                R-10 
A-2                 C-10                J-16                R-11 
A-3                 C-11                J-17                R-12 
A-4                 C-12                K-1                 R-13 
A-5                 C-13                K-2                 R-14 
A-6                 C-14                K-3                 R-15 
A-7                 C-15                K-15                R-16 
A-8                 C-16                K-16                R-17 
A-9                 C-17                K-17                T-1 
A-10                D-1                 L-1                 T-2 
A-11                D-2                 L-2                 T-3 
A-12                D-3                 L-3                 T-4 
A-13                D-15                L-15                T-5 
A-14                D-16                L-16                T-6 
A-15                D-17                L-17                T-7 
A-16                E-1                 M-1                 T-8 
A-17                E-2                 M-2                 T-9 
B-1                 E-3                 M-3                 T-10 
B-2                 E-15                M-15                T-11 
B-3                 E-16                M-16                T-12 
B-4                 E-17                M-17                T-13 
B-5                 F-1                 N-1                 T-14 
B-6                 F-2                 N-2                 T-15 
B-7                 F-3                 N-3                 T-16 
B-8                 F-15                N-15                T-17 
B-9                 F-16                N-16                U-1 
B-10                F-17                N-17                U-2 
B-11                G-1                 P-1                 U-3 
B-12                G-2                 P-2                 U-4 
B-13                G-3                 P-3                 U-5 
B-14                G-15                P-15                U-6 
B-15                G-16                P-16                U-7 
B-16                G-17                P-17                U-8 
B-17                H-1                 R-1                 U-9 
C-1                 H-2                 R-2                 U-10 
C-2                 H-3                 R-3                 U-11 
C-3                 H-15                R-4                 U-12 
C-4                 H-16                R-5                 U-13 
C-5                 H-17                R-6                 U-14 
C-6                 J-1                 R-7                 U-15 
C-7                 J-2                 R-8                 U-16 
C-8                 J-3                 R-9                 U-17 

LOGIC SYMBOL 

[Figure LS003081: logic symbol. Signals shown: S/DR, S/DS, S/DF, CLK, ENR, ENS, ENF, ENRF, ENI, OEF, OES, RFSEL0-RFSEL2, PSEL0-PSEL3, QSEL0-QSEL3, TSEL0-TSEL3, FSEL, I0-I13, RM0-RM2, SLAVE, SIGN, FLAG1-FLAG6, and MSERR, together with the data buses.] 

ORDERING INFORMATION 
Standard Products 

AMD standard products are available in several packages and operating ranges. The order number (Valid Combination) is formed by 
a combination of: 
a. Device Number 
b. Speed Option (if applicable) 
c. Package Type 
d. Temperature Range 
e. Optional Processing 

a. DEVICE NUMBER/DESCRIPTION 
   Am29C327 
   Double-Precision Floating-Point Processor 

b. SPEED OPTION 
   Not Applicable 

c. PACKAGE TYPE 
   G = 169-Lead Pin Grid Array without Heatsink (CGX169) 

d. TEMPERATURE RANGE 
   C = Commercial (0 to +70°C) 

e. OPTIONAL PROCESSING 
   Blank = Standard processing 
   B = Burn-in 

Valid Combinations 

AM29C327      GC, GCB 

Valid Combinations list configurations planned to be 
supported in volume for this device. Consult the local AMD 
sales office to confirm availability of specific valid 
combinations, to check on newly released combinations, and 
to obtain additional data on AMD's standard military grade 
products. 

PIN DESCRIPTION 

CLK Clock (Input) 
Clock input to all registers. 

ENF F Register Enable (Input; Active LOW) 
When ENF is HIGH, the contents of the F register are static. 
When ENF is LOW, the ALU output data is clocked into the 
F register on the next LOW-to-HIGH transition of CLK. Note 
that the F register can be made transparent by setting the 
mode register bit M17 HIGH (as described in the Mode 
Register Description section); when the F register is 
transparent, ENF has no effect. 

ENI Instruction Register Enable (Input; Active LOW) 
When ENI is LOW, an instruction word is clocked into the 
instruction register on the next LOW-to-HIGH transition of 
CLK. The instruction word comprises the following fields: P, 
Q, and T-multiplexer control inputs, rounding modes, ALU 
instruction inputs, and the precision of the output operand. 

ENR R Register Enable (Input; Active LOW) 
When ENR is HIGH, the contents of the R register are static. 
When ENR is LOW, new data is loaded into the R register 
on the next LOW-to-HIGH transition of CLK. 

ENRF Register File Enable (Input; Active LOW) 
When ENRF is HIGH, the contents of the register file are 
static. When ENRF is LOW, the ALU output operand is 
clocked into the register file on the next LOW-to-HIGH 
transition of CLK. 

ENS S Register Enable (Input; Active LOW) 
When ENS is HIGH, the contents of the S register are static. 
When ENS is LOW, new data is loaded into the S register on 
the next LOW-to-HIGH transition of CLK. 

F0-F31 F Output Bus (Output) 

FLAG1-FLAG6 Flag Outputs (Output) 
The six flag outputs report the status of the last operation 
executed. 

FSEL Output Multiplexer Control (Input) 
When FSEL is HIGH, the most significant 32 bits of the 
output register are connected to the output driver. When 
FSEL is LOW, the least significant 32 bits of the output 
register are connected to the output driver. 

I0-I13 ALU Instruction Inputs (Input) 
I0-I13 select the operation to be performed by the ALU. 

MSERR Master/Slave Error Flag (Output) 
A HIGH level indicates a master/slave error on the current 
output. 

OEF F Output Bus Enable (Input; Active LOW) 
When OEF is HIGH, signals F0-F31 assume a high-impedance 
state. When OEF is LOW (and SLAVE is HIGH), 
the output of the F multiplexer is placed on F0-F31. 

OES Flag Output Enable (Input) 
When OES is HIGH, outputs SIGN and FLAG1 through 
FLAG6 assume a high-impedance state. When OES is LOW 
(and SLAVE is HIGH), these signals are enabled. 

PSEL0-PSEL3 P-Multiplexer Control Inputs (Input) 
PSEL0-PSEL3 select the data input to the ALU P-port. 

QSEL0-QSEL3 Q-Multiplexer Control Inputs (Input) 
QSEL0-QSEL3 select the data input to the ALU Q-port. 

R0-R31 R Input Bus (Input) 

RFSEL0-RFSEL2 Register File Select (Input) 
RFSEL0-RFSEL2 select the register file location 
(RF0-RF7) to which the ALU result is to be written. Data is 
written to the register file if ENRF is LOW. 

RM0-RM2 Round Mode Control Inputs (Input) 
The Am29C327 supports six rounding modes. RM0-RM2 
select the rounding mode to be applied to the current 
operation. 

S0-S31 S Input Bus (Input) 

S/DF F Output Single/Double Control (Input) 
When S/DF is HIGH, the ALU generates a single-precision 
result. When S/DF is LOW, the ALU generates a double-precision 
result. 

S/DR R Input Single/Double Control (Input) 
When S/DR is HIGH, the data loaded into the R-port is 
treated as single precision. When S/DR is LOW, the data 
loaded into the R register is treated as double precision. 

S/DS S Input Single/Double Control (Input) 
When S/DS is HIGH, the data loaded into the S-port is 
treated as single precision. When S/DS is LOW, the data 
loaded into the S register is treated as double precision. 

SIGN Sign Flag (Output) 
If the final result of the last operation was negative, SIGN is 
HIGH. If the final result of the last operation was not 
negative, SIGN is LOW. 

SLAVE Master/Slave Mode Select (Input) 
When SLAVE is LOW, SLAVE mode is selected. In this 
mode, all outputs except MSERR are disabled. When 
SLAVE is HIGH, MASTER mode is selected. 

TSEL0-TSEL3 T-Multiplexer Control Inputs (Input) 
TSEL0-TSEL3 select the data input to the ALU T-port. 

FUNCTIONAL DESCRIPTION 
Overview 

The Am29C327 is a high-performance, single-chip, double-precision 
floating-point processor. 

Architecture 

The Am29C327 comprises a high-speed ALU, a 64-bit data 
path, and control circuitry. 

The core of the Am29C327 is a 64-bit floating-point/integer 
ALU. This ALU takes operands from three 64-bit input ports 
and performs the selected operation, placing the result on a 
64-bit output port. Thirteen ALU flags report operation status 
via the 7-bit Flag port. The ALU is completely combinatorial for 
reduced latency; optional pipelining is available to boost 
throughput for array operations. 

The data path consists of the 32-bit input buses R and S; two 
64-bit input operand registers; an 8-by-64-bit register file for 
storage of intermediate results; three operand-selection multiplexers 
that provide for orthogonal selection of input operands; 
a 64-bit output register; and an output multiplexer that 
permits the selection of 32 MSBs or 32 LSBs of data. Input 
operands enter the processor through the R and S buses, and 
are then demultiplexed and buffered for subsequent storage in 
registers R and S. The operand selection multiplexers route 
the operands to the ALU. Operation results are stored in 
register F, and leave the device on the 32-bit output bus F. 
The results can also be stored in the register file for use in 
subsequent operations. 

Instruction Set 

The Am29C327 implements 58 arithmetic and logical instructions. 
Thirty-five instructions operate on floating-point numbers; 
these instructions fall into the following categories: 

• Addition/subtraction 

• Multiplication 

• Multiplication-accumulation 

• Comparison 

• Selecting the larger or smaller of two numbers 

• Rounding to integral value 

• Absolute value, negation 

• Reciprocal seed generation 

• Conversion between any of the supported floating-point 
formats 

• Conversion of a floating-point number to an integer format, 
with or without a scale factor 

• Pass operand 

By concatenating these operations, the user can also perform 
division, square-root extraction, polynomial evaluation, and 
other functions not implemented directly. 
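
As an illustration of how division might be built from the operations listed above, the following sketch applies Newton-Raphson refinement to a reciprocal seed. It is a software model only: `reciprocal_seed` and `divide` are hypothetical names, the seed here is a crude exponent-only guess rather than whatever the device's reciprocal seed instruction actually returns, and the iteration count is chosen for that crude seed.

```python
import math


def reciprocal_seed(b: float) -> float:
    """Hypothetical stand-in for a reciprocal-seed instruction.

    Keeps the sign and exponent of b and guesses 1.5 for the reciprocal of
    the mantissa, so the relative error of the seed is at most 50%.
    """
    _, exponent = math.frexp(b)          # b = m * 2**exponent, 0.5 <= |m| < 1
    return math.copysign(math.ldexp(1.5, -exponent), b)


def divide(a: float, b: float, iterations: int = 6) -> float:
    """Approximate a / b by refining an estimate of 1 / b."""
    if b == 0.0:
        raise ZeroDivisionError("divisor must be nonzero")
    x = reciprocal_seed(b)
    for _ in range(iterations):
        # x <- x * (2 - b * x): one multiply-subtract followed by one multiply.
        x = x * (2.0 - b * x)
    return a * x


print(divide(1.0, 3.0))
```

Each iteration roughly squares the relative error, so a more accurate seed would need fewer passes. The sketch covers ordinary finite divisors only; it makes no attempt to handle denormalized numbers, infinities, or NaNs.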

Twenty-two instructions operate on integers, and belong to the 
following general categories: 

• Addition/subtraction 

• Multiplication 

• Comparison 

• Selecting the smaller or larger of two numbers 

• Absolute value, negation, pass operand 



• Logical operations; e.g., AND, OR, XOR, NOT 

• Arithmetic, logical, and funnel shifts 

• Conversion between single- and double-precision integer 
formats 

• Conversion of an integer number to a floating-point format, 
with or without a scale factor 

One special instruction is provided to move data. 

Mixed-Precision Operations 

All Am29C327 instructions, floating-point or integer, can be 
performed with either single- or double-precision operands. In 
addition, the user can elect to mix precisions within an 
operation. All operations are performed in double-precision 
internally; the user specifies the precisions of the input 
operands and the required precision for the output operand. 
The necessary precision conversions are made in concert with 
the selected operation, with no additional cycle-time over- 
head. 
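
The mixed-precision behavior described above can be modeled roughly in software. In this sketch the function names and the boolean precision flags are assumptions made for illustration; Python floats stand in for the internal double-precision arithmetic, and rounding to single precision is emulated with the standard `struct` module. The model rounds twice (once in double, once to single), which may differ in rare cases from hardware that rounds once.

```python
import struct


def to_single(value: float) -> float:
    """Round a Python float (IEEE double) to the nearest IEEE single."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def mixed_multiply(r, s, r_is_single, s_is_single, result_is_single):
    """Multiply with per-operand precision, computing in double internally."""
    p = to_single(r) if r_is_single else r
    q = to_single(s) if s_is_single else s
    product = p * q                      # double-precision arithmetic
    return to_single(product) if result_is_single else product


# Single-precision R operand, double-precision S operand, double-precision result.
print(mixed_multiply(0.1, 3.0, True, False, False))
```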

I/O Modes 

The Am29C327 supports eight I/O modes that afford flexible 
interface to a variety of 32- and 64-bit systems. 

Fault Detection Features 

The Am29C327 contains special comparison hardware to 
allow the operation of two processors in parallel, with one 
processor (the slave) checking the results produced by the 
other (the master). This feature is of particular importance in 
the design of high-reliability systems. 

Block Diagram Description 

A block diagram of the Am29C327 is shown in Figure 1. The 
Am29C327 comprises input registers, operand selection multiplexers, 
instruction register, ALU, output register/register file, 
status register, output selection multiplexer, mode register, 
and the master/slave comparator. 

Input Registers/Input Modes 

Operands enter the processor through the R and S buses, and 
are then demultiplexed and buffered for subsequent storage in 
the 65-bit registers R and S. Input operands may be either 
single-precision (32-bit) or double-precision (64-bit) as specified 
by S/DR and S/DS. Accompanying the input registers are 
two 32-bit temporary registers, R-Temp and S-Temp, that 
allow for the overlapping of operand transfers and ALU 
operations. This arrangement of temporary registers and 
demultiplexers permits data and corresponding precision bit 
S/DR or S/DS to be loaded into the 65-bit R register and 
65-bit S register via one of the eight input modes: 

1. 32-bit-bus, double-cycle, LSWs first 
2. 32-bit-bus, double-cycle, MSWs first 
3. 32-bit-bus, single-cycle, LSWs first 
4. 32-bit-bus, single-cycle, MSWs first 
5. 64-bit-bus, double-cycle, R first 
6. 64-bit-bus, double-cycle, S first 
7. 64-bit-bus, single-cycle, R first 
8. 64-bit-bus, single-cycle, S first 

These modes are described in detail in the Input Modes 
Description section. 
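
To make the "LSWs first" and "MSWs first" distinction concrete, the sketch below assembles one 64-bit operand from two 32-bit transfers. It is an illustrative model only: the function name and argument order are assumptions, and it does not model bus timing, the temporary registers, or the 65th precision bit.

```python
def assemble_operand(first_word: int, second_word: int, lsw_first: bool) -> int:
    """Combine two 32-bit transfers into one 64-bit operand."""
    first_word &= 0xFFFFFFFF
    second_word &= 0xFFFFFFFF
    if lsw_first:
        lsw, msw = first_word, second_word   # least significant word arrived first
    else:
        msw, lsw = first_word, second_word   # most significant word arrived first
    return (msw << 32) | lsw


# The IEEE double 1.0 has MSW 0x3FF00000 and LSW 0x00000000.
print(hex(assemble_operand(0x00000000, 0x3FF00000, lsw_first=True)))
```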

Operand Selection Multiplexers 

The operand selection multiplexers route operands to the 
ALU. In addition to selecting operands from input registers 
R and S and register file locations RF0-RF7, these multiplexers 
have access to a set of constants (0, 0.5, 1, 2, 3, Pi). 
These constants are preprogrammed double-precision numbers 
for use in ALU operations, and are automatically provided 
in the appropriate floating-point or integer format. 

Instruction Register 

The instruction register stores a 32-bit word specifying the 
current processor operation. Included in the instruction word 
are fields that specify the P, Q, and T multiplexer selects; the 
rounding modes; the core operation to be performed by the 
ALU; sign-change controls for ALU input and result operands; 
and the single/double-precision control for the output operand. 
The multiplexer selects and the instruction word are 
described in detail in the Instruction Set section; rounding 
modes are described in Appendix B. 

ALU 

The ALU is a combinatorial arithmetic/logic unit that performs 
a large repertoire of floating-point and integer operations. The 
ALU has three operand inputs, and performs operations of the 
form (P*Q) + T. Most ALU operations require only one or two 
input operands; for example, addition requires only operands 
P and T, multiplication only operands P and Q, and precision 
conversion only operand P. Many ALU arithmetic operations 
allow for the independent control of operand signs, thus 
greatly increasing the number of arithmetic expressions that 
can be evaluated in a single ALU pass. 
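
The effect of the sign controls on the (P*Q) + T form can be sketched as follows. The function and parameter names are hypothetical and are not taken from the instruction encoding; the defaults Q = 1 and T = 0 mimic how addition and multiplication use only two of the three operands. Python rounds the product and the sum separately, which may differ from the device's internal rounding.

```python
def alu_pass(p, q=1.0, t=0.0, negate_p=False, negate_t=False, negate_result=False):
    """Evaluate (+/-P * Q) +/- T, optionally negating the final result."""
    if negate_p:
        p = -p
    if negate_t:
        t = -t
    result = p * q + t
    return -result if negate_result else result


print(alu_pass(2.0, 3.0, 4.0))                  # P*Q + T
print(alu_pass(2.0, 3.0, 4.0, negate_t=True))   # P*Q - T
print(alu_pass(5.0, t=1.5))                     # addition: P + T
```

Combining the three sign controls gives expressions such as T - P*Q or -(P*Q + T) without any extra pass through the ALU model.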

The ALU can be configured in either a flow-through mode, for 
which the ALU is completely combinatorial, or a pipelined 
mode, for which ALU operations incur one or two pipeline 
delays, but which results in a higher throughput than flow- 
through mode. 

A detailed description of ALU operations appears in the 
Instruction Set section. 

Output Register/Register File 

The results of the operations performed by the ALU are stored 
in the 64-bit output register F. Results can also be stored in 
the 8-by-64-bit register file for use in subsequent operations. 
Each register file location contains a 65th bit indicating the 
precision of the operand stored in that location, thus permitting 
the ALU to correctly process the operand in subsequent 
operations. 

Status Register 

The status register is a 7-bit register that stores flags 
pertaining to the most recently performed operation. A de- 
tailed description is provided in the Instruction Set section. 

Output Multiplexer 

The output multiplexer routes operation results to the F bus. 
This multiplexer selects the 32 MSBs of the output register or 
the 32 LSBs. 

Master/Slave Comparator 

Each Am29C327 output signal has associated logic that 
compares that signal with the signal that the processor is
providing internally to the output driver; any discrepancies are
indicated by assertion of signal MSERR.

For a single processor, this output comparison detects short
circuits in output signals or defective output drivers, but does
not detect open circuits. It is possible to connect a second
processor in parallel with the first, with the second processor's
outputs disabled by assertion of signal SLAVE. The second
processor detects open-circuit signals, as well as providing a
check of the outputs of the first.

Mode Register 

The mode register contains processor parameters that are 
changed infrequently. The 32-bit mode word is loaded into the 
register via the R bus. A detailed description of the mode 
register is provided in the Mode Register Description section. 

Mode Register Description

The "Load Mode Register" instruction loads a 32-bit word
appearing on the R port into the mode register. Data is
clocked into the register on the LOW-to-HIGH transition of
CLK. The register is organized as described below:

M0-M3 - Floating-Point Format Select:

M1  M0  Primary Format
0   0   IEEE
0   1   DEC F (SINGLE), DEC D (DOUBLE)
1   0   DEC F (SINGLE), DEC G (DOUBLE)
1   1   IBM

M3  M2  Alternate Format
0   0   IEEE
0   1   DEC F (SINGLE), DEC D (DOUBLE)
1   0   DEC F (SINGLE), DEC G (DOUBLE)
1   1   IBM

Primary and Alternate Floating-Point Formats 

All floating-point operations with the appropriate precisions
are performed in the primary format selected by mode register
bits M0 and M1 except for the two following operations:

1. "Convert T to Alternate Floating-Point Format" in
which the T operand is in the Primary Floating-Point
Format selected by mode register bits M0 and M1,
and the result generated is in the Alternate Floating-Point
Format specified by mode register bits M2
and M3.

2. "Convert T from Alternate Floating-Point Format"
in which the T operand is in the Alternate Floating-Point
Format specified by mode register bits M2
and M3, and the result is in the Primary Floating-Point
Format specified by mode register bits M0
and M1.

Conversion or Scaling from Integer to Floating-Point generates
a floating-point result in the Primary Floating-Point Format
selected by mode register bits M0 and M1.

When mode register bits M2 and M3 are not used to specify an 
Alternate Floating-Point Format, they are "don't cares". 

Floating-point formats are discussed in further detail in Appen- 
dix A. 
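
The format-select fields can be decoded with a few lines of Python. This is only a sketch: it assumes that mode bit Mn occupies bit n of the 32-bit mode word and that the two-bit codes run 00 = IEEE, 01 = DEC F/DEC D, 10 = DEC F/DEC G, 11 = IBM for both the primary (M1, M0) and alternate (M3, M2) fields. The names `FORMATS` and `decode_formats` are hypothetical.

```python
# Illustrative decoder for mode register bits M0-M3 (names are hypothetical).
# Assumption: mode bit Mn is bit n of the 32-bit mode word.
FORMATS = {
    0b00: "IEEE",
    0b01: "DEC F (single), DEC D (double)",
    0b10: "DEC F (single), DEC G (double)",
    0b11: "IBM",
}

def decode_formats(mode_word):
    """Return (primary, alternate) floating-point format names."""
    primary = FORMATS[mode_word & 0b11]            # M1, M0
    alternate = FORMATS[(mode_word >> 2) & 0b11]   # M3, M2
    return primary, alternate
```

For example, a mode word whose low four bits are 1100 would select IEEE as the primary format and IBM as the alternate format under these assumptions.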

M4 — Saturate Enable: If M4 is HIGH, overflowed results are 
replaced by the largest representable value in the selected 
format of the same sign as the overflowed result. If M4 is 
LOW, the result is not changed. If M6 is HIGH and the result 
format is IEEE, saturation is disabled.

M5 - IEEE Affine/Projective Select: If M5 is HIGH, affine
mode is selected. If M5 is LOW, projective mode is selected. 
The interpretation of infinities is determined by M5. The only 
differences between the modes occur during the addition and 
subtraction of infinities. 



Operation      Affine Mode   Projective Mode
(+∞) + (+∞)    Output +∞     Output Quiet NAN, set invalid and reserved operand flags
(-∞) + (-∞)    Output -∞     Output Quiet NAN, set invalid and reserved operand flags
(+∞) - (-∞)    Output +∞     Output Quiet NAN, set invalid and reserved operand flags
(-∞) - (+∞)    Output -∞     Output Quiet NAN, set invalid and reserved operand flags

If the current floating-point format is not IEEE, this bit has no 
effect. 
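
The difference between the two modes can be modeled in a few lines of Python. This is a behavioral sketch only, covering the four like-signed cases listed for M5; the function name and the flag strings are hypothetical, and Python floats stand in for the processor's operands.

```python
import math

def combine_infinities(a, b, subtract, affine):
    """Model (a + b) or (a - b) for two infinities under mode bit M5.

    Only the cases whose affine result is a well-defined infinity are
    handled, e.g. (+inf) + (+inf) or (+inf) - (-inf).
    Returns (result, set_of_flags).
    """
    if not (math.isinf(a) and math.isinf(b)):
        raise ValueError("both operands must be infinite")
    result = a - b if subtract else a + b
    if math.isnan(result):
        raise ValueError("case not covered by this sketch")
    if affine:                      # M5 HIGH: signed infinities combine normally
        return result, set()
    # M5 LOW (projective): quiet NaN plus invalid and reserved-operand flags
    return math.nan, {"invalid", "reserved operand"}
```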

M6 - IEEE Trap Enable: If M6 is HIGH and the result format
is IEEE, IEEE trapped operation is enabled; the saturate (M4)
and sudden underflow (M7) bits are ignored. For an underflowed
result, the exponent is replaced by e = e + 192 (SP), or
e = e + 1536 (DP), with the significand unchanged. For an
overflowed result, the exponent is replaced by e = e - 192
(SP), or e = e - 1536 (DP), with the significand unchanged. If
M6 is LOW or the result format is not IEEE, IEEE trapped
operation is disabled.
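
The exponent adjustment applied in trapped operation is simple arithmetic, sketched below in Python. The adjustment constants 192 (single precision) and 1536 (double precision) are those given for M6; the function name and the string arguments are hypothetical.

```python
# Exponent wrap applied to trapped IEEE results (illustrative names).
BIAS_ADJUST = {"SP": 192, "DP": 1536}

def trapped_exponent(e, precision, condition):
    """Return the replacement exponent for a trapped result.

    e         -- exponent of the exact (out-of-range) result
    precision -- "SP" or "DP"
    condition -- "underflow" or "overflow"
    """
    adjust = BIAS_ADJUST[precision]
    if condition == "underflow":
        return e + adjust      # move the result up into range
    if condition == "overflow":
        return e - adjust      # move the result down into range
    raise ValueError("condition must be 'underflow' or 'overflow'")
```

The significand is left unchanged, so trap-handling software can recover the true result by reversing the adjustment.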

M7 - IEEE Sudden Underflow Enable: If M7 is HIGH and
IEEE traps are disabled (M6 LOW), all IEEE denormalized
results are replaced by a zero of the same sign. If M7 is LOW,
a valid denormalized number will be produced. This bit has no
effect for result formats other than IEEE.

M8 - IBM Significance Mask Enable: If M8 is HIGH, certain
IBM operations having intermediate results of 0 will produce a
final result of 0 with the biased exponent unchanged. If M8 is
LOW, these operations will produce a final result of true-zero.
This bit has no effect for result formats other than IBM.

M9 - IBM Underflow Mask Enable: If M9 is HIGH, certain
underflowed IBM operations will produce a normalized result
with the exponent replaced by e + 128. If M9 is LOW, these
operations will produce a final result of true-zero. This bit has
no effect for result formats other than IBM.

M10: Reserved for future use (must be set to Logic 0)

M11 - Integer Multiplication Signed/Unsigned Select: If
M11 is HIGH, the input operands are treated as two's-complement
numbers. If M11 is LOW, the input operands are
treated as unsigned numbers. This bit has no effect for 
operations other than integer multiplication. 

M12, M13 — Integer Multiplication Format Adjust: Selects 
the output format for integer multiplications. The user may 
select either the MSBs or the LSBs of the result of an integer 
multiplication: 



M13  M12  Output Format
0    0    LSBs
0    1    LSBs, format-adjusted
1    0    MSBs
1    1    MSBs, format-adjusted



"Format-adjusted" indicates that the product is shifted left 
one place before the MSBs or LSBs are selected. 
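
The effect of the integer-multiplication mode bits can be illustrated with a short Python model. It is a sketch under stated assumptions rather than a description of the hardware: operands are taken to be 32 bits wide with a 64-bit product, M12 is taken to be the format-adjust bit, and M13 is taken to select MSBs over LSBs. All names are hypothetical.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def to_signed32(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def integer_multiply(p, q, m11, m12, m13):
    """Model of M11 (signed), M12 (format adjust), M13 (MSB select).

    Assumes 32-bit operands and a 64-bit product.
    """
    if m11:                                   # two's-complement operands
        p, q = to_signed32(p), to_signed32(q)
    else:                                     # unsigned operands
        p, q = p & MASK32, q & MASK32
    product = (p * q) & MASK64
    if m12:                                   # shift left one place first
        product = (product << 1) & MASK64
    return (product >> 32) & MASK32 if m13 else product & MASK32
```

With two operands of 0x40000000, the MSB half of the product is 0x10000000 without format adjustment and 0x20000000 with it.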

M14-M16 - Input Mode: Selects the input bus mode:

M16  M15  M14  Input Mode
0    0    0    32-bit-bus, single-cycle, LSW first
0    0    1    32-bit-bus, single-cycle, MSW first
0    1    0    32-bit-bus, double-cycle, LSW first
0    1    1    32-bit-bus, double-cycle, MSW first
1    0    0    64-bit-bus, single-cycle, R first
1    0    1    64-bit-bus, single-cycle, S first
1    1    0    64-bit-bus, double-cycle, R first
1    1    1    64-bit-bus, double-cycle, S first



Additional information on input modes can be found in 
the Input Modes section. 
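
As an illustration, the input-mode field can be decoded as follows. The bit meanings follow the M16, M15, M14 values given with each input mode in the Input Modes section (M16 selects the 64-bit bus, M15 selects double-cycle loading, M14 selects which word or operand comes first). The assumption that mode bit Mn is bit n of the mode word, and the function name, are this sketch's own.

```python
def decode_input_mode(mode_word):
    """Decode mode register bits M14-M16 into (bus width, cycles, first)."""
    m14 = (mode_word >> 14) & 1
    m15 = (mode_word >> 15) & 1
    m16 = (mode_word >> 16) & 1
    bus_width = 64 if m16 else 32
    cycles = "double" if m15 else "single"
    if m16:
        first = "S" if m14 else "R"        # 64-bit bus: operand order
    else:
        first = "MSW" if m14 else "LSW"    # 32-bit bus: word order
    return bus_width, cycles, first
```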

M17 - F Register Feedthrough Enable: When M17 is HIGH,
register F is made transparent. When M17 is LOW, the ALU
output data is clocked into the F register on the next LOW-to-HIGH
transition of CLK.

M18 - Status Register Feedthrough Enable: When M18 is
HIGH, the status register is made transparent. When M18 is
LOW, the output flags are clocked into the status register on
the next LOW-to-HIGH transition of CLK.

M19, M20 - Pipeline Mode Select:

M20  M19  Pipeline Mode
0    X    Flow-through mode
1    0    Single-pipeline mode for all operations
1    1    Double-pipeline mode for multiply/accumulate;
          single-pipeline mode for other operations


Input Modes 

The Am29C327 supports a total of eight input modes for 
loading data into the R and S registers. 

The 32-bit bus modes allow the user to connect each input 
port (R0-R31 and S0-S31) to separate 32-bit buses. 64-bit 
operands can then be loaded by placing the MSBs and LSBs 
alternately on the appropriate ports. In the 64-bit bus modes, 
the two input ports are configured internally as a single 64-bit 
port. The Am29C327 may then be connected directly to a 64-bit
bus, and 64-bit operands may be loaded in a single operation.
Either the 32-bit bus modes or the 64-bit bus modes may
be used regardless of the precision of the operands being
transferred; the choice of input mode will in practice be
determined by the system into which the Am29C327 is to be
integrated.

Single-cycle input modes allow two 64-bit operands to be
loaded in a single clock cycle. This necessitates driving the
input buses at twice the speed of the Am29C327. For systems
where this is not practical, the double-cycle modes allow the
loading of one 64-bit operand (or two 32-bit operands) per
clock cycle.

Data may be loaded from the input buses to the R register and 
S register using one of the eight input modes: 



1. 32-Bit Bus, Single-Cycle, LSWs First
2. 32-Bit Bus, Single-Cycle, MSWs First
3. 32-Bit Bus, Double-Cycle, LSWs First
4. 32-Bit Bus, Double-Cycle, MSWs First
5. 64-Bit Bus, Single-Cycle, R First
6. 64-Bit Bus, Single-Cycle, S First
7. 64-Bit Bus, Double-Cycle, R First
8. 64-Bit Bus, Double-Cycle, S First



M21 - M31 - Reserved for factory test (must be set to Logic 0) 



The choice of the input modes is determined by mode register 
bits M14-M16. 

In order to permit the loading of new operands to be 
overlapped with the execution of a current operation, tempo- 
rary registers are provided within the "operand router" block 
(shown in Figure 1). The operation of these temporary 
registers is transparent to the user. The conditions under 
which they are loaded depends on the input mode selected. 

The eight input modes are described on the following pages. 

32-Bit Bus, Single-Cycle, LSW First (M16 = 0, M15 = 0,
M14 = 0)

In this mode, the two halves of the 64-bit R operand are
placed on the R-input bus in successive half-cycles, with the S
operand similarly placed on the S-input port. After one
complete cycle, the R and S registers contain the R and S
operands, respectively.

Timing of Operations with Input Mode 1
(32-Bit Bus, Single-Cycle, LSW First)*

*Assumes flow-through operation, F register, and S register clocked.



In this mode, the temporary registers are clocked on every 
HIGH-to-LOW clock transition. 

At 1, the least-significant 32 bits of the R operand are loaded
from the R-input port into the R-temp register, and the least-significant
32 bits of the S operand are loaded from the S-input
port into the S-temp register. Both words are loaded on the
HIGH-to-LOW transition of the clock.

At 2, the most-significant 32 bits of the R operand are loaded
from the R-input port into the most-significant half of the R
register, and the most-significant 32 bits of the S operand are
loaded from the S-input port into the most-significant half of
the S register.

At the same time, at 2, the output of the R-temp register is 
loaded into the least-significant half of the R register, and the 
output of the S-temp register is loaded into the least- 
significant half of the S register. 

If an input operand is single-precision, the 32-bit data is kept 
on the input bus for the full cycle. 
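
The two-step transfer used by the 32-bit bus modes can be modeled as a register-transfer sketch in Python. It models only the data movement for one operand (the first word is held in a temporary register, then both halves reach the operand register together), not the clock-edge timing; the function and variable names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def load_32bit_bus(first_word, second_word, msw_first=False):
    """Assemble a 64-bit operand from two successive 32-bit bus words.

    Event 1: the first word is captured in the temporary register.
    Event 2: the second word is loaded directly into its half of the
             operand register while the temporary register supplies
             the other half.
    """
    temp = first_word & MASK32        # event 1
    direct = second_word & MASK32     # event 2
    if msw_first:
        msw, lsw = temp, direct       # MSW-first modes
    else:
        msw, lsw = direct, temp       # LSW-first modes
    return (msw << 32) | lsw
```

The same function covers the single-cycle and double-cycle variants, which differ in when the two events occur rather than in where the words end up.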

32-Bit Bus, Single-Cycle, MSW First (M16 = 0, M15 = 0,
M14 = 1)

In this mode, the two halves of the 64-bit R operand are
placed on the R-input bus in successive half-cycles, with the S
operand similarly placed on the S-input port. After one
complete cycle, the R and S registers contain the R and S
operands, respectively.

Timing of Operations with Input Mode 2
(32-Bit Bus, Single-Cycle, MSW First)*

*Assumes flow-through operation, F register, and S register clocked.



In this mode, the temporary registers are clocked on every 
HIGH-to-LOW clock transition. 

At 1, the most-significant 32 bits of the R operand are loaded
from the R-input port into the R-temp register, and the most-significant
32 bits of the S operand are loaded from the S-input
port into the S-temp register. Both words are loaded on the
HIGH-to-LOW transition of the clock.

At 2, the least-significant 32 bits of the R operand are loaded
from the R-input port into the least-significant half of the R
register, and the least-significant 32 bits of the S operand are
loaded from the S-input port into the least-significant half of
the S register.

At the same time, at 2, the output of the R-temp register is 
loaded into the most-significant half of the R register, and the 
output of the S-temp register is loaded into the most- 
significant half of the S register. 

If an input operand is single-precision, the 32-bit data is kept 
on the input bus for the full cycle. 

32-Bit Bus, Double-Cycle, LSW First (M16 = 0, M15 = 1,
M14 = 0)

In this mode, the two halves of the 64-bit R operand are
placed on the R-input bus in successive cycles, with the S
operand similarly placed on the S-input port. After two cycles,
the R and S registers contain the R and S operands,
respectively.

Timing of Operations with Input Mode 3
(32-Bit Bus, Double-Cycle, LSW First)*

*Assumes flow-through operation, F register, and S register clocked.



In this mode, the temporary registers are clocked on every 
LOW-to-HIGH clock transition. 

At 1, the least-significant 32 bits of the R operand are loaded
from the R-input port into the R-temp register, and the least-significant
32 bits of the S operand are loaded from the S-input
port into the S-temp register.

At 2, the most-significant 32 bits of the R operand are loaded
from the R-input port into the most-significant half of the R
register, and the most-significant 32 bits of the S operand are
loaded from the S-input port into the most-significant half of
the S register.

At the same time, at 2, the output of the R-temp register is 
loaded into the least-significant half of the R register, and the 
output of the S-temp register is loaded into the least- 
significant half of the S register. 

32-Bit Bus, Double-Cycle, MSW First (M16 = 0, M15 = 1,
M14 = 1)

In this mode, the two halves of the 64-bit R operand are
placed on the R-input bus in successive cycles, with the S
operand similarly placed on the S-input port. After two cycles,
the R and S registers contain the R and S operands,
respectively.



[Timing diagram: CLK, R0-R31, S0-S31, instruction lines, S/DR, S/DS, ENR, ENS, ENI, F0-F31, FLAGS, SIGN]

Timing of Operations with Input Mode 4
(32-Bit Bus, Double-Cycle, MSW First)*

*Assumes flow-through operation, F register, and S register clocked.



In this mode, the temporary registers are clocked on every 
LOW-to-HIGH clock transition.

At 1, the most-significant 32 bits of the R operand are loaded
from the R-input port into the R-temp register, and the most-significant
32 bits of the S operand are loaded from the S-input
port into the S-temp register.

At 2, the least-significant 32 bits of the R operand are loaded
from the R-input port into the least-significant half of the R
register, and the least-significant 32 bits of the S operand are
loaded from the S-input port into the least-significant half of
the S register.

At the same time, at 2, the output of the R-temp register is 
loaded into the most-significant half of the R register, and the 
output of the S-temp register is loaded into the most- 
significant half of the S register. 

64-Bit Bus, Single-Cycle, R First (M16 = 1, M15 = 0,
M14 = 0)

In this mode, the MSW of the 64-bit R operand is placed on
the R-input bus and the LSW on the S-input bus. Both
halfwords are loaded in the first half cycle. Similarly, the two
halves of the S operand are loaded in the second half cycle.
After one full cycle, the R and S registers contain the R and S
operands, respectively.



[Timing diagram: CLK, R0-R31, S0-S31, instruction lines, S/DR, S/DS, ENR, ENS, ENI, FLAGS, SIGN]

Timing of Operations with Input Mode 5
(64-Bit Bus, Single-Cycle, R First)*

*Assumes flow-through operation, F register, and S register clocked.



In this mode, the temporary registers are clocked on every 
HIGH-to-LOW clock transition. 

At 1, the most-significant 32 bits of the R operand are loaded
from the R-input port into the R-temp register, and the least-significant
32 bits of the R operand are loaded from the S-input
port into the S-temp register.

At 2, the most-significant 32 bits of the S operand are loaded
from the R-input port into the most-significant half of the S
register, and the least-significant 32 bits of the S operand are
loaded from the S-input port into the least-significant half of
the S register.

At the same time, at 2, the output of the R-temp register is
loaded into the most-significant half of the R register, and the
output of the S-temp register is loaded into the least-significant
half of the R register.
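
The 64-bit bus modes can be modeled the same way. In this Python sketch each bus transfer is a pair (word on the R-input port, word on the S-input port); the first pair passes through the temporary registers and the second pair is loaded directly. Only data movement is modeled, not clock-edge timing, and the names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def load_64bit_bus(first_pair, second_pair, s_first=False):
    """Return the R and S register contents after two 64-bit transfers.

    Each pair is (MSW on the R-input port, LSW on the S-input port).
    Event 1: the first pair is captured in R-temp and S-temp.
    Event 2: the second pair is loaded directly into one operand
             register while the temporaries fill the other.
    """
    r_temp, s_temp = first_pair
    from_temps = ((r_temp & MASK32) << 32) | (s_temp & MASK32)
    msw, lsw = second_pair
    direct = ((msw & MASK32) << 32) | (lsw & MASK32)
    if s_first:                       # S operand arrived first
        return {"R": direct, "S": from_temps}
    return {"R": from_temps, "S": direct}   # R operand arrived first
```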

64-Bit Bus, Single-Cycle, S First (M16 = 1, M15 = 0,
M14 = 1)

In this mode, the MSW of the 64-bit S operand is placed on the
R-input bus and the LSW on the S-input bus. Both halfwords
are loaded in the first half cycle. Similarly, the two halves of
the R operand are loaded in the second half cycle. After one
full cycle, the R and S registers contain the R and S operands,
respectively.



[Timing diagram: CLK, R0-R31, S0-S31, instruction lines, S/DR, S/DS, ENR, ENI, FLAGS, SIGN]

Timing of Operations with Input Mode 6
(64-Bit Bus, Single-Cycle, S First)*

*Assumes flow-through operation, F register, and S register clocked.



In this mode, the temporary registers are clocked on every 
HIGH-to-LOW clock transition. 

At 1, the most-significant 32 bits of the S operand are loaded
from the R-input port into the R-temp register, and the least-significant
32 bits of the S operand are loaded from the S-input
port into the S-temp register.

At 2, the most-significant 32 bits of the R operand are loaded
from the R-input port into the most-significant half of the R
register, and the least-significant 32 bits of the R operand are
loaded from the S-input port into the least-significant half of
the R register.

At the same time, at 2, the output of the R-temp register is
loaded into the most-significant half of the S register, and the
output of the S-temp register is loaded into the least-significant
half of the S register.

64-Bit Bus, Double-Cycle, R First (M16 = 1, M15 = 1,
M14 = 0)

In this mode, the MSW of the 64-bit R operand is placed on
the R-input bus and the LSW on the S-input bus. Both
halfwords are loaded in the first cycle. Similarly, the two halves
of the S operand are loaded in the second cycle. After the two
cycles, the R and S registers contain the R and S operands,
respectively.



[Timing diagram: CLK, R0-R31, S0-S31, instruction lines, S/DR, S/DS, ENR, ENI, FLAGS, SIGN]

Timing of Operations with Input Mode 7
(64-Bit Bus, Double-Cycle, R First)*

*Assumes flow-through operation, F register, and S register clocked.



In this mode, the temporary registers are clocked on every 
LOW-to-HIGH clock transition. 

At 1, the most-significant 32 bits of the R operand are loaded
from the R-input port into the R-temp register, and the least-significant
32 bits of the R operand are loaded from the S-input
port into the S-temp register.

At 2, the most-significant 32 bits of the S operand are loaded
from the R-input port into the most-significant half of the S
register, and the least-significant 32 bits of the S operand are
loaded from the S-input port into the least-significant half of
the S register.

At the same time, at 2, the output of the R-temp register is 
loaded into the most-significant half of the R register, and the 
output of the S-temp register is loaded into the least- 
significant half of the R register. 

64-Bit Bus, Double-Cycle, S First (M16 = 1, M15 = 1,
M14 = 1)

In this mode, the MSW of the 64-bit S operand is placed on the
R-input bus and the LSW on the S-input bus. Both halfwords
are loaded in the first cycle. Similarly, the two halves of the R
operand are loaded in the second cycle. After the two cycles,
the R and S registers contain the R and S operands,
respectively.



[Timing diagram: CLK, R0-R31, S0-S31, instruction lines, S/DR, S/DS, ENR, ENI, FLAGS, SIGN]

Timing of Operations with Input Mode 8
(64-Bit Bus, Double-Cycle, S First)*

*Assumes flow-through operation, F register, and S register clocked.



In this mode, the temporary registers are clocked on every 
LOW-to-HIGH clock transition. 

At 1, the most-significant 32 bits of the S operand are loaded
from the R-input port into the R-temp register, and the least-significant
32 bits of the S operand are loaded from the S-input
port into the S-temp register.

At 2, the most-significant 32 bits of the R operand are loaded
from the R-input port into the most-significant half of the R
register, and the least-significant 32 bits of the R operand are
loaded from the S-input port into the least-significant half of
the R register.

At the same time, at 2, the output of the R-temp register is 
loaded into the most-significant half of the S register, and the 
output of the S-temp register is loaded into the least- 
significant half of the S register. 

Pipelining of Operations

The floating-point ALU of the Am29C327 may be operated in 
one of three pipeline modes: 

1. Flow-Through Mode 

2. Single-Pipelined Mode 

3. Double-Pipelined Mode

Flow-Through Mode 

In this mode the floating-point ALU acts as a purely combina- 
torial device. 

Single-Pipelined Mode 

In this mode the floating-point ALU contains a single pipeline
delay for all operations; throughput is roughly double that for
unpipelined mode. Simplified diagrams for the ALU configuration
for single-pipelined mode are shown in Figure 2.

Double-Pipelined Mode

In this mode, which applies only to the multiplication-accumulation
operation, the ALU contains two pipeline delays;
throughput is roughly triple that for the unpipelined
multiplication-accumulation operation. Simplified block diagrams are
shown in Figure 3.

Figures 4 and 5 provide timing diagrams for all operations 
except multiply-accumulate, illustrating flow-through mode and 
pipelined mode, respectively. Figures 6, 7, and 8 provide 
timing diagrams for multiply-accumulate, illustrating flow- 
through mode, single-pipelined mode, and double-pipelined 
mode, respectively. 



The choice of pipelining mode affects only the floating-point 
ALU. Operations of other parts of the Am29C327, such as the 
input registers, the output register, the mode register, and the 
instruction register are not affected by the choice of pipelining 
mode. However, the instruction bits are pipelined as they pass 
through the ALU. This permits instructions to be interleaved in 
pipelined mode. 

The desired pipeline mode or modes can be invoked by setting 
mode register bits M19 and M20 to the appropriate values.

When using the Am29C327 in either single-pipelined or 
double-pipelined mode, two conditions must be observed: 

1. The "load mode register" instruction is not pipelined, nor
are any of the mode register bits. When the mode register
is loaded, any differences between the current mode and
the previous mode take effect immediately. In single-pipelined
mode, the user should separate the last valid
ALU instruction and the "load mode register" instruction
with one "NO-OP" instruction. In double-pipelined mode,
the user should separate them with two "NO-OP" instructions.
A NO-OP instruction is any instruction whose result
is not stored in register F or the register file.

2. A multiplication-accumulation instruction cannot be immediately
followed by any other type of instruction. This
problem can be avoided by inserting a "dummy"
multiplication-accumulation instruction at the end of a
multiplication-accumulation sequence. This "dummy" is any
instruction whose results are not stored in register F or the
register file.
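
The first condition lends itself to a simple microcode-scheduling check, sketched here in Python. The sketch treats a program as a list of mnemonic strings and inserts enough NO-OPs ahead of each mode-register load to satisfy the separation rule (one for single-pipelined mode, two for double-pipelined mode). The mnemonics `"LOAD_MODE"` and `"NOP"` and the function name are hypothetical, and every instruction other than `"NOP"` is conservatively treated as an ALU instruction.

```python
def pad_before_mode_load(program, pipeline_delays):
    """Insert NO-OPs so each LOAD_MODE is separated from earlier work.

    pipeline_delays -- 1 for single-pipelined, 2 for double-pipelined.
    """
    padded = []
    for instr in program:
        if instr == "LOAD_MODE":
            # Count NO-OPs already sitting directly before this load.
            trailing = 0
            for prev in reversed(padded):
                if prev != "NOP":
                    break
                trailing += 1
            has_earlier_work = trailing < len(padded)
            if has_earlier_work:
                padded.extend(["NOP"] * max(0, pipeline_delays - trailing))
        padded.append(instr)
    return padded
```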

Figure 2. ALU Configuration for Single-Pipelined Mode

Instruction Set

Instruction Register Format 

The 14-bit instruction word I0-I13 comprises sign-change 
controls, integer/floating-point select bit, and the opcode. 

I13 I12    I11 I10    I9 I8      I7 I6      I5       I4 I3 I2 I1 I0
SIGN (P)   SIGN (Q)   SIGN (T)   SIGN (F)   INT/FP   OPCODE

The opcode field, I4-I0, specifies the core operation to be
performed by the ALU; instruction bit I5 selects between
floating-point and integer formats. The core operations and
their corresponding opcodes are listed in Table 1.
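
Assembling an instruction word from its fields is a matter of shifting each field into place. The Python sketch below assumes the field positions I13-I12 = SIGN (P), I11-I10 = SIGN (Q), I9-I8 = SIGN (T), I7-I6 = SIGN (F), I5 = INT/FP, and I4-I0 = opcode. The function name is hypothetical, and the INT/FP bit is passed through as given without assuming which value selects which format.

```python
def pack_instruction(sign_p, sign_q, sign_t, sign_f, int_fp, opcode):
    """Pack the 14-bit instruction word I13-I0 from its fields."""
    fields = [
        ("sign_p", sign_p, 2), ("sign_q", sign_q, 2),
        ("sign_t", sign_t, 2), ("sign_f", sign_f, 2),
        ("int_fp", int_fp, 1), ("opcode", opcode, 5),
    ]
    for name, value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    return ((sign_p << 12) | (sign_q << 10) | (sign_t << 8)
            | (sign_f << 6) | (int_fp << 5) | opcode)
```

For instance, Table 4 lists P - T as core operation P + T (opcode 00001) with the T sign field set to 01; with the other fields zero this packs to 0x101.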



TABLE 1. CORE OPERATIONS/OPCODES

I4  I3  I2  I1  I0   Operation (Floating-Point)
0   0   0   0   0    P
0   0   0   0   1    P + T
0   0   0   1   0    P * Q
0   0   0   1   1    COMPARE P, T
0   0   1   0   0    MAX P, T
0   0   1   0   1    MIN P, T
0   0   1   1   0    CONVERT T TO INTEGER
0   0   1   1   1    SCALE T TO INTEGER BY Q
0   1   0   0   0    (P * Q) + T
0   1   0   0   1    ROUND T TO INTEGRAL VALUE
0   1   0   1   0    RECIPROCAL SEED OF P
0   1   0   1   1    CONVERT T TO ALTERNATE F.P. FORMAT
0   1   1   0   0    CONVERT T FROM ALTERNATE F.P. FORMAT



Operation (Integer)

P
P + T
P * Q
COMPARE P, T
MAX P, T
MIN P, T
CONVERT T TO FLOATING-POINT
SCALE T TO FLOATING-POINT BY Q
P OR T
P AND T
P XOR T
SHIFT P LOGICAL Q PLACES
SHIFT P ARITHMETIC Q PLACES
FUNNEL SHIFT PT LOGICAL Q PLACES



Core operations MOVE P and LOAD MODE REGISTER can both be performed in either floating-point or integer format:

I5  I4  I3  I2  I1  I0   Operation
X   1   1   0   0   0    MOVE P
X   1   1   1   1   1    LOAD MODE REGISTER

Sign-Change Selects

Each ALU input and output operand has associated hardware
that can be used to modify operand signs (see Figure 9).
These sign-change blocks, when applied to core operations,
greatly increase the number of available operations. A core
operation of P + T, for example, can be used to perform
operations such as P - T, ABS(P + T), ABS(P) + ABS(T), and
others, simply by modifying the signs of the input and output
operands.

Using the sign-change blocks, the sign of an input operand
may be left unchanged, inverted, set to zero, or set to one; the
sign of the output operand may be left unchanged, set to zero,
set to one, set to the sign of the P input operand, or set to the
sign of the T input operand. Select decodes for the P, Q, T,
and F operand sign-change blocks are shown in Tables 2-1, 2-2,
2-3, and 2-4, respectively.
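
A small Python model shows how one core operation yields several arithmetic expressions. It is a sketch only: the two-bit select codes (00 = unchanged, 01 = inverted, 10 = sign set to zero, 11 = sign set to one) are read off the Table 4 entries for P - T, ABS (P) + ABS (T), and ABS (P) - ABS (T); the output block's P-sign and T-sign options are omitted; signed zeros are ignored; and the function names are hypothetical.

```python
def change_sign(value, select):
    """Apply a two-bit sign-change select to a number."""
    if select == 0b00:
        return value            # sign unchanged
    if select == 0b01:
        return -value           # sign inverted
    if select == 0b10:
        return abs(value)       # sign bit forced to 0 (non-negative)
    if select == 0b11:
        return -abs(value)      # sign bit forced to 1 (non-positive)
    raise ValueError("select must be a 2-bit code")

def add_core(p, t, sign_p, sign_t, sign_f):
    """Core operation P + T with sign-change blocks on P, T, and F."""
    result = change_sign(p, sign_p) + change_sign(t, sign_t)
    return change_sign(result, sign_f)
```

With P = 3 and T = 5, select codes (00, 01, 00) give P - T = -2, while (10, 11, 10) applied to P = -3 and T = 5 give ABS (ABS (P) - ABS (T)) = 2.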




Figure 9. ALU Sign-Change Blocks



TABLE 2-1. SELECT DECODE FOR P OPERAND
SIGN-CHANGE BLOCK

I13  I12  Sign (P')
0    0    SIGN (P)
0    1    SIGN (P), inverted
1    0    0
1    1    1

TABLE 2-2. SELECT DECODE FOR Q OPERAND
SIGN-CHANGE BLOCK

I11  I10  Sign (Q')
0    0    SIGN (Q)
0    1    SIGN (Q), inverted
1    0    0
1    1    1

TABLE 2-3. SELECT DECODE FOR T OPERAND
SIGN-CHANGE BLOCK

I9  I8  Sign (T')
0   0   SIGN (T)
0   1   SIGN (T), inverted
1   0   0
1   1   1



TABLE 2-4. SELECT DECODE FOR F OPERAND
SIGN-CHANGE BLOCK

Core Operation   I11  I10  I7  I6  Sign (F)
P, Max P, T,     0    X    0   0   SIGN (F')
or Min P, T      0    X    0   1   SIGN (F'), inverted
                 0    X    1   0   0
                 0    X    1   1   1
                 1    0    X   X   SIGN (P)
                 1    1    X   X   SIGN (T)
Other            X    X    0   0   SIGN (F')
                 X    X    0   1   SIGN (F'), inverted
                 X    X    1   0   0
                 X    X    1   1   1

Operand Multiplexer Selects

Instruction fields PSEL0-PSEL3, QSEL0-QSEL3, and
TSEL0-TSEL3 specify the select codes for the P, Q, and T
operand multiplexers, respectively; the codes are summarized
in Table 3.





TABLE 3. OPERAND MULTIPLEXER SELECT CODES

PSEL3-PSEL0,
QSEL3-QSEL0, or
TSEL3-TSEL0       P, Q, or T
0 0 0 0           R
0 0 0 1           S
0 0 1 0           0
0 0 1 1           0.5 (Floating Point), -1 (Integer)
0 1 0 0           1
0 1 0 1           2
0 1 1 0           3
0 1 1 1           Pi (Floating Point), Max Neg. Two's-Comp. Value (Integer)
1 0 0 0           Register File Location 0 (RF0)
1 0 0 1           Register File Location 1 (RF1)
1 0 1 0           Register File Location 2 (RF2)
1 0 1 1           Register File Location 3 (RF3)
1 1 0 0           Register File Location 4 (RF4)
1 1 0 1           Register File Location 5 (RF5)
1 1 1 0           Register File Location 6 (RF6)
1 1 1 1           Register File Location 7 (RF7)



Operand Precisions 

The Am29C327 supports mixed-precision operations, so that it 
is possible, for example, for an operation to have single- 
precision inputs and a double-precision output, or one single- 
and one double-precision input, or any other combination. 

Precision of the operands in registers R and S is specified by 
signals S/DR and S/DS. A logic HIGH indicates a single- 
precision operand or operands; a LOW, double precision. 

Precision of an operation result is specified by signal S/DF. A 
logic HIGH indicates a single-precision operand; a logic LOW, 
double-precision. 

Operands stored in the register file are each accompanied by
a bit indicating that operand's precision; this precision information
is automatically supplied to the ALU when a register file
location is used as an input operand to an operation.

Processor Operations 

Table 4 illustrates a number of possible ALU instructions 
comprising the opcode, integer/floating-point select, and sign- 
change fields. Note that the remaining instruction bits — P, Q, 
and T operand multiplexer selects; the rounding modes; and the 
output operand precision — can be specified independently. 

The user may create instructions using instruction words other 
than those listed in Table 4. For some core operations, sign- 
change control settings are completely arbitrary; for others, 
only the sign-change field values shown in Table 4 are valid. 
Table 5 summarizes permissible sign-change field values for 
each core operation. 

TABLE 4. INSTRUCTION WORDS

                                         Sign
Operation                              P   Q   T   F    Opcode
FP  P                                  00  00  XX  00   00000
FP  -P                                 00  00  XX  01   00000
FP  ABS (P)                            00  00  XX  10   00000
FP  Sign (T) * ABS (P)                 00  11  XX  XX   00000
FP  P + T                              00  XX  00  00   00001
FP  P - T                              00  XX  01  00   00001
FP  T - P                              01  XX  00  00   00001
FP  -P - T                             01  XX  01  00   00001
FP  ABS (P + T)                        00  XX  00  10   00001
FP  ABS (P - T)                        00  XX  01  10   00001
FP  ABS (P) + ABS (T)                  10  XX  10  00   00001
FP  ABS (P) - ABS (T)                  10  XX  11  00   00001
FP  ABS (ABS (P) - ABS (T))            10  XX  11  10   00001
FP  P * Q                              00  00  XX  00   00010
FP  (-P) * Q                           01  00  XX  00   00010
FP  ABS (P * Q)                        00  00  XX  10   00010
FP  Compare P, T                       00  XX  01  00   00011
FP  Max P, T                           00  00  01  00   00100
FP  Max ABS (P), ABS (T)               10  00  11  00   00100
FP  Min P, T                           01  00  00  00   00101
FP  Min ABS (P), ABS (T)               11  00  10  00   00101
FP  Limit P to Magnitude T             11  10  10  XX   00101
FP  Convert T to Integer               XX  XX  00  00   00110
FP  Scale T to Integer by Q            XX  00  00  00   00111
FP  T + P * Q                          00  00  00  00   01000
FP  T - P * Q                          01  00  00  00   01000
FP  -T + P * Q                         00  00  01  00   01000
FP  -T - P * Q                         01  00  01  00   01000
FP  ABS (T) + ABS (P * Q)              10  10  10  00   01000
FP  ABS (T) - ABS (P * Q)              11  10  10  00   01000
FP  ABS (P * Q) - ABS (T)              10  10  11  00   01000
FP  Round T to Integral Value          XX  XX  00  00   01001
FP  Reciprocal Seed (P)                00  XX  XX  00   01010
FP  Convert T to Alternate
    Floating-Point Format              XX  XX  00  00   01011
FP  Convert T from Alternate
    Floating-Point Format              XX  XX  00  00   01100
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TABLE 4. INSTRUCTION WORDS (Cont'd) 






Operation 


Sign 


l/F 


Opcode 


P 


Q 


T 


F 


Int P 

Int -P 

Int ABS (P) 

Int sign (T)*ABS (P) 


00 
00 
00 

00 


00 
00 
00 

11 


00 
00 
00 
00 


00 
01 
10 

XX 




00000 
00000 
00000 
00000 


Int P + T 
Int P-T 
Int T-P 
Int ABS (P + T) 
Int ABS (P-T) 


00 
00 
01 
00 
00 


XX 
XX 
XX 
XX 
XX 


00 
01 
00 
00 
01 


00 
00 
00 
10 
10 




00001 
00001 
00001 
00001 
00001 


Int P * Q 


00 


00 


XX 


00 




00010 


Int Compare P, T 


00 


XX 


01 


00 




00011 


Int Max P, T 


00 


00 


01 


00 




00100 


Int Min P, T 


01 


00 


00 


00 




00101 


Int Convert T to Float 


XX 


XX 


00 


00 




00110 


Int Scale T to Float by Q 


XX 


00 


00 


00 




00111 


Int P OR T 


XX 


XX 


XX 


XX 




10000 


Int P AND T 


XX 


XX 


XX 


XX 




10001 


Int P XOR T 

Int NOT T (see Note 1) 


XX 
XX 


XX 
XX 


XX 
XX 


XX 
XX 




10010 
10010 


Int Shift P Logical Q Places 


00 


00 


XX 


00 




10011 


Int Sliift P Arithmetic Q Places 


00 


00 


XX 


00 




10100 


Int Funnel Shift PT Q Places 


00 


00 


00 


00 




10101 


Move P 


XX 


XX 


XX 


XX 


X 


11000 


Load Mode Register 


XX 


XX 


XX 


XX 


X 


11111 



Notes; 1. NOT T is pertormed by XORing T with a word containing all 1's (integer -1). When invoking NOT T the 
user must set PSELs - PSELo to 001 1 2, thus selecting integer constant - 1 . 
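
The add-type rows of Table 4 suggest how the two-bit sign-change fields combine with the core operation. The sketch below is illustrative only: the meaning assigned to each code (00 = unchanged, 01 = negate, 10 = absolute value, 11 = negated absolute value) is an assumption inferred from the table rows, and the function names are hypothetical.

```python
def apply_sign(code, x):
    """Apply a two-bit sign-change code to a value (assumed meanings)."""
    if code == 0b00:
        return x          # sign unchanged
    if code == 0b01:
        return -x         # sign inverted
    if code == 0b10:
        return abs(x)     # sign forced positive
    if code == 0b11:
        return -abs(x)    # sign forced negative
    raise ValueError("sign code must be a two-bit value")


def add_core(p, t, sign_p, sign_t, sign_f):
    """Model the P + T core operation with sign changes on P, T, and F."""
    return apply_sign(sign_f, apply_sign(sign_p, p) + apply_sign(sign_t, t))


# Table 4 rows expressed with this model:
print(add_core(5, 3, 0b00, 0b01, 0b00))    # P - T
print(add_core(5, 3, 0b01, 0b00, 0b00))    # T - P
print(add_core(-5, 3, 0b10, 0b11, 0b00))   # ABS(P) - ABS(T)
print(add_core(2, -7, 0b10, 0b11, 0b10))   # ABS(ABS(P) - ABS(T))
```

Under this reading, every add-type instruction in Table 4 is the same core operation with different sign-change settings.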
TABLE 5. ALLOWABLE SIGN-CHANGE/CORE-OPERATION COMBINATIONS 

                                 Sign-Change Fields
I5  I4-I0  Core Operation        Sign (P)  Sign (Q)  Sign (T)  Sign (F)
0   00000  FP P                     V         V         X         V
0   00001  FP P + T                 V         X         V         V
0   00010  FP P*Q                   V         V         X         V
0   00011  FP Compare P, T          F         X         F
0   00100  FP Max P, T              F         F         F
0   00101  FP Min P, T              F         F         F
0   00110  FP Cvt T to Int          X         X         F
0   00111  FP Scale T to Int        X         F         F
0   01000  FP P*Q + T               V         V         V
0   01001  FP Round T               X         X         F
0   01010  FP Recip Seed P          F         X         X
0   01011  FP Cvt T to Alt Fmt      X         X         F
0   01100  FP Cvt T fm Alt Fmt      X         X         F
1   00000  Int P                    F         F         F
1   00001  Int P + T                F         X         F
1   00010  Int P*Q                  F         F         X         F
1   00011  Int Compare P, T         F         X         F
1   00100  Int Max P, T             F         F         F
1   00101  Int Min P, T             F         F         F
1   00110  Int Cvt T to f.p.        X         X         F
1   00111  Int Scale T to f.p.      X         F         F
1   10000  Int P OR T               X         X         X         X
1   10001  Int P AND T              X         X         X         X
1   10010  Int P XOR T              X         X         X         X
1   10011  Int Shift P Logical      F         F         X         F
1   10100  Int Shift P Arith        F         F         X         F
1   10101  Int Funnel Shift PT      F         F         F         F
X   11000  Move P                   X         X         X         X
X   11111  Load Mode Reg            X         X         X         X

Key: V = Variable; user can specify arbitrary sign change. 
     F = Fixed; user is restricted to sign change combinations shown in Table 4. 
     X = Don't care; this field does not affect the operation or its result. 



Descriptions of Operations 

P (Floating-Point or Integer): The operand on port P is 
passed through the ALU to port F. This operation may be used 
to change the precision of an operand, negate an operand, 
extract the absolute value of an operand, or transfer the sign 
of operand T to operand P. 

P + T (Floating-Point or Integer): The addition operation 
(P + T) adds the operands on ports P and T, and places the 
result on port F. 

P*Q (Floating-Point or Integer): The multiplication operation 
(P*Q) multiplies the operands on ports P and Q, and places 
the result on port F. 

COMPARE P, T (Floating-Point or Integer): This operation 
compares the operands on ports P and T, and places (P - T) 
on port F. One of four comparison flags ( = , > , < , #) is set 
according to the result of the comparison. Note that the 
unordered flag (#) can be set only when the format selected 
is IEEE or DEC. 

MAX P, T (Floating-Point or Integer): This operation selects 
the most positive of the two operands on ports P and T, and 
places the result on port F. 

MIN P, T (Floating-Point or Integer): This operation selects 
the most negative of the two operands on ports P and T, and 
places the result on port F. 

LIMIT P TO MAGNITUDE T (Floating-Point): This operation 
imposes a clipping or saturation level on operand P by 
comparing the magnitudes of the operands on ports P and T. If 
operand P has the smaller magnitude, it is placed on port F; if 
operand T has the smaller magnitude, it is placed on port F, 
but with its sign modified to agree with that of operand P. This 
operation is equivalent to SIGN(P) * MIN( ABS(P), 
ABS(T) ). 
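
The equivalence stated above can be sketched in a few lines of Python. This is a behavioral illustration using host floating-point arithmetic, not a model of the device; the function name is hypothetical.

```python
import math


def limit_to_magnitude(p, t):
    """Return SIGN(P) * MIN(ABS(P), ABS(T)): P clipped to magnitude |T|."""
    return math.copysign(min(abs(p), abs(t)), p)


print(limit_to_magnitude(5.0, 3.0))    # P exceeds the limit, clipped
print(limit_to_magnitude(-5.0, 3.0))   # clipped, sign of P retained
print(limit_to_magnitude(2.0, -3.0))   # P within the limit, passed through
```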

CONVERT T TO INTEGER (Floating-Point): The floating- 
point-to-integer conversion operation takes a floating-point 
operand on port T and places the equivalent two's-complement 
integer value on port F. 

CONVERT T TO FLOATING-POINT (Integer): The integer- 
to-floating-point conversion operation takes a two's-complement 
integer operand on port T and places the equivalent 
floating-point value on port F. 

SCALE T TO INTEGER BY Q (Floating-Point): This opera- 
tion converts the floating-point operand T to integer format 
using the floating-point operand Q as a scale factor. The true 
exponent of Q is added to the true exponent of T before the 
new value T is converted to integer format. The operation 
therefore permits T to be multiplied by any power of two when 
the source format is IEEE or DEC, and by any power of 16 
when the source format is IBM. 
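
As an illustration of the scaling step, the following sketch adds the true exponent of Q to T before converting to an integer. It assumes a binary format with IEEE-style 1.f normalization and round-to-nearest-even conversion; the function names are hypothetical, and only the exponent of Q (not its fraction) takes part in the scaling.

```python
import math


def true_exponent(q):
    """True (unbiased) exponent of q for a 1.f-normalized binary format."""
    _, e = math.frexp(q)      # q = m * 2**e with 0.5 <= |m| < 1
    return e - 1              # renormalize to 1 <= |m| < 2


def scale_to_integer(t, q):
    """Add the true exponent of Q to T, then convert T to an integer."""
    return round(math.ldexp(t, true_exponent(q)))


print(scale_to_integer(3.0, 8.0))    # T multiplied by 2**3
print(scale_to_integer(3.0, 12.0))   # 12.0 has the same true exponent as 8.0
print(scale_to_integer(10.0, 0.5))   # T multiplied by 2**-1
```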

SCALE T TO FLOATING-POINT BY Q (Integer): This opera- 
tion converts the integer operand T to floating-point format 
using the operand Q as a scale factor, where Q is a floating- 
point operand in the destination format. The true exponent of 
Q is added to the true exponent of T after T has been 
converted from integer to floating-point. The operation 
therefore permits T to be scaled by any power of two when 
the destination format is IEEE or DEC, and by any power of 
16 when the destination format is IBM. 

(P*Q) + T (Floating-Point): This operation multiplies the oper- 
ands on port P and Q, adds the product to the operand on port 
T, and places the result on port F. 

ROUND T TO INTEGRAL VALUE (Floating-Point): This 
operation rounds a floating-point operand to an integer-valued 
floating-point operand of the same format. A value of 3.5, for 
example, would be rounded to either 3.0 or 4.0, the choice 
depending on the rounding mode. 

RECIPROCAL SEED OF P (Floating-Point): The reciprocal 
seed of the floating-point operand on port P is placed on port 
F; the result obtained is a crude estimate of the input 
operand's reciprocal. This operation can be used as the initial 
step in performing Newton-Raphson division. A single-preci- 
sion result is obtained after five iterations, and a double- 
precision result after six iterations. Alternately, an external 
seed look-up table can be used for faster convergence. The 
result obtained through iteration is approximate. 
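
The Newton-Raphson refinement can be sketched as follows. The seed used here is an assumption made for illustration (a power of two derived from the operand's exponent, accurate to roughly one bit); it is not the seed the device produces, and the function names are hypothetical. With such a seed the relative error squares on each pass, which is consistent with the iteration counts quoted above.

```python
import math


def crude_seed(d):
    """Hypothetical reciprocal seed: a power of two with the sign of d."""
    _, e = math.frexp(d)                       # d = m * 2**e, 0.5 <= |m| < 1
    return math.copysign(math.ldexp(1.0, -e), d)


def reciprocal(d, iterations):
    """Refine the seed with x <- x * (2 - d * x), repeated `iterations` times."""
    x = crude_seed(d)
    for _ in range(iterations):
        x = x * (2.0 - d * x)
    return x


# Relative error of the approximation to 1/3 after each pass.
for n in range(7):
    print(n, abs(reciprocal(3.0, n) * 3.0 - 1.0))
```

Each pass costs one multiply and one multiply-subtract, so a more accurate seed (for example, from an external look-up table) reduces the number of passes needed.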

CONVERT T TO ALTERNATE FLOATING-POINT FORMAT 
(Floating-Point): This operation converts operand T from the 
primary floating-point format to the alternate floating-point 
format, thus allowing conversions among the IEEE, DEC, and 
IBM floating-point formats. 

CONVERT T FROM ALTERNATE FLOATING-POINT FOR- 
MAT (Floating-Point): This operation converts operand T 
from the alternate floating-point format to the primary floating- 
point format, in a manner similar to that of CONVERT T TO 
ALTERNATE FLOATING-POINT FORMAT above. 

P OR T, P AND T, P XOR T, NOT T (Integer): The logical 
operations (OR, AND, EXCLUSIVE OR) are performed on the 
operands on ports P and T, and the result is placed on port F. 
NOT T is performed by XORing T with a word containing all 
ones (integer -1). When invoking NOT T, instruction bits 
PSEL3-PSEL0 must be set to 0011, thus selecting integer 
constant -1. 

SHIFT P LOGICAL Q PLACES (Integer): This operation 
logically shifts operand P by Q places. If the shift is Q places to 
the right, Q zeros are filled from the left. If the shift is Q places 
to the left, Q zeros are filled from the right. 



SHIFT P ARITHMETIC Q PLACES (Integer): This operation 
arithmetically shifts operand P by Q places. With a right shift, 
the result is sign extended Q places. With a left shift, zeros 
are filled from the right. 
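
The difference between the logical and arithmetic shifts can be illustrated on 32-bit words. In this sketch the shift direction is encoded as the sign of q (positive = left, negative = right); that encoding is an assumption made for the example, as are the function names.

```python
MASK32 = 0xFFFFFFFF


def shift_logical(p, q):
    """Logical shift of a 32-bit word; vacated bit positions are zero-filled."""
    p &= MASK32
    return (p << q) & MASK32 if q >= 0 else p >> -q


def shift_arithmetic(p, q):
    """Arithmetic shift of a 32-bit word; right shifts replicate the sign bit."""
    p &= MASK32
    if q >= 0:
        return (p << q) & MASK32
    signed = p - (1 << 32) if p & 0x80000000 else p
    return (signed >> -q) & MASK32


print(hex(shift_logical(0x80000000, -4)))     # zeros enter from the left
print(hex(shift_arithmetic(0x80000000, -4)))  # sign bit is extended
print(hex(shift_logical(0x00000001, 4)))      # zeros enter from the right
```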

FUNNEL SHIFT PT LOGICAL Q PLACES (Integer): The 

operands on ports P and T are concatenated to form a double- 
width operand PT, which is then shifted to the right or left by Q 
places; the 32- or 64-bit result is placed on port F. 

MOVE P (Floating-Point or Integer): The operand on port P 
is moved to port F. The operand is left unchanged, and only 
the sign flag is set. 

Operation Flags 

For each operation, the ALU produces thirteen flags that 
indicate operation status. Of the flags produced, a maximum 
of seven are relevant to any given operation. The relevant 
flags are placed in the status register, and the other flags are 
discarded. 

The ALU flags are: 

C — CARRY: Carry-out bit produced by integer addition, 
subtraction, or comparison. 

I — INVALID OPERATION: Input operands are unsuitable for 
the operation specified (e.g., ∞ * 0). 

R — RESERVED OPERAND: Reserved operand detected/ 
generated. 

S — SIGN: Result sign. 

U — UNDERFLOW: Result underflowed the destination for- 
mat. 

V — OVERFLOW: Result overflowed the destination format. 

W — WINNER: Indicates which of the two operands was selected 
when performing Max/Min operations. 

X — INEXACT RESULT: Result had to be rounded to fit the 
destination format. 

Z — ZERO: Zero result. 

> , = , < , # — GREATER THAN, EQUAL, LESS THAN, 

UNORDERED: Used to report the result of a comparison 
operation. 

Table 6 lists the flags reported for each operation. 

TABLE 6. ORGANIZATION OF FLAGS 



Operations 


Opcode 
I4-I0 


Flag Register 


MSB LSB 
7 6 5 4 3 2 1 


IEEE Non-arithmetic single-operand 

IEEE Operations using add 

IEEE Operations using multiply 

IEEE Compare 

IEEE Maximum, minimum, limit 

IEEE Convert/scale to integer 

IEEE Multiply/accumulate 

IEEE Round to integral value 

IEEE Reciprocal seed 

IEEE Convert to alt. f.p. format 

IEEE Convert from alt. f.p. format 


00000 
00001 
00010 
00011 
0010x 
0011x 
01000 
01001 
01010 
01011 
01100 


S 
S 

s 
s 
s 
s 
s 
s 
s 
s 
s 


z 
z 
z 

z 
z 
z 
z 

z 
z 
z 


X 
X 
X 

> 

X 

X 

X 
X 


u 
u 
u 
< 
w 

u 

u 

u 
u 


V 
V 
V 

# 

V 
V 
V 
V 
V 
V 


R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 




DEC D Non-arithmetic single-operand 

DEC D Operations using add 

DEC D Operations using multiply 

DEC D Compare 

DEC D Maximum, minimum, limit 

DEC D Convert/scale to integer 

DEC D Multiply/accumulate 

DEC D Round to integral value 

DEC D Reciprocal seed 

DEC D Convert to alt. f.p. format 

DEC D Convert from alt. f.p. format 


00000 
00001 
00010 
00011 
0010x 
0011x 
01000 
01001 
01010 
01011 
01100 


s 
s 
s 
s 
s 
s 
s 
s 
s 
s 
s 


z 
z 
z 

z 
z 
z 
z 
z 
z 
z 


X 
X 
X 

> 

X 

X 

X 
X 


u 
u 

< 
w 

u 

u 
u 
u 


V 
V 
V 

# 

V 
V 
V 
V 
V 
V 


R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 


1 
1 
1 


DEC G Non-arithmetic single-operand 

DEC G Operations using add 

DEC G Operations using multiply 

DEC G Compare 

DEC G Maximum, minimum, limit 

DEC G Convert/scale to integer 

DEC G Multiply/accumulate 

DEC G Round to integral value 

DEC G Reciprocal seed 

DEC G Convert to alt. f.p. format 

DEC G Convert from alt. f.p. format 


00000 
00001 
00010 
00011 
0010x 
0011x 
01000 
01001 
01010 
01011 
01100 


s 
s 
s 
s 
s 
s 
s 

s 

s 
s 
s 


z 
z 
z 

z 
z 
z 
z 
z 
z 
z 


X 
X 
X 

> 

X 

X 

X 
X 


u 
u 
u 
< 
w 

u 

u 
u 
u 


V 
V 
V 

# 

V 
V 

V 
V 
V 
V 


R 
R 
R 
R 
R 
R 
R 
R 
R 
R 
R 


1 

1 
1 
1 


IBM Non-arithmetic single-operand 

IBM Operations using add 

IBM Operations using multiply 

IBM Compare 

IBM Maximum, minimum, limit 

IBM Convert/scale to integer 

IBM Multiply/accumulate 

IBM Round to integral value 

IBM Reciprocal seed 

IBM Convert to alt. f.p. format 

IBM Convert from alt. f.p. format 


00000 
00001 
00010 
00011 
0010x 
0011x 
01000 
01001 
01010 
01011 
01100 


s 
s 
s 
s 
s 
s 
s 
s 
s 
s 
s 


z 
z 

z 

z 
z 
z 
z 
z 
z 
z 


X 
X 
X 

> 

X 

X 

X 
X 


u 
u 

< 
w 

u 

u 
u 


V 
V 
V 

V 
V 
V 
V 
V 
V 


R 
R 


1 
1 


Integer Non-arithmetic single-operand 
Integer Sign transfer 
Integer Operations using add 
Integer Operations using multiply 
Integer Compare operations 
Integer Maximum, minimum, limit 
Integer Convert to float 
Integer Scale to float 
Integer Logical operations 
Integer Arithmetic shift 
Integer Funnel shift 


00000 
00000 
00001 
00010 
00011 
0010x 
00110 
00111 
100xx 
10100 
10101 


s 
s 
s 
s 
s 
s 
s 
s 
s 
s 
s 


z 
z 
z 
z 

z 
z 
z 
z 
z 
z 


> 

X 
X 


< 

w 

u 


V 
V 
V 
V 
V 

V 
V 


R 


C 
■ C 


Move operand 
Load mode register 


11000 

11111 


s 















Note: Unused flags assume the LOW state. 

Master/Slave Operation 

Two Am29C327 processors can be tied together in a master/ 
slave configuration, with the slave checking the results produced 
by the master. All input and output signals of the slave, 
with the exception of SLAVE and MSERR, are tied to the 
corresponding signals of the master. The master is selected 
by asserting signal SLAVE LOW; the slave, by asserting signal 
SLAVE HIGH. 

The slave processor, by comparing its outputs to the outputs 
of the master processor, performs a comprehensive check of 
the operation of the master processor. In addition, the slave 
processor may detect open circuits and other faults in the 
electrical path between the master processor and the system. 
Note that the master processor still performs the comparison 
between its outputs and its own internally generated results, 
and is therefore able to detect faults in its output drivers. 

APPENDICES 



APPENDIX A — DATA FORMATS 

The following data formats are supported: 32-bit integer, 64-bit 
integer, IEEE single-precision, IEEE double-precision, DEC F, 
DEC D, DEC G, IBM single-precision, and IBM double- 
precision. 



The primary and alternate floating-point formats are selected 
by mode register bits M0 to M3. The user may select between 
floating-point operations and integer operations by means of 
instruction bit I5. 

The nine supported formats are described below: 



Integer Formats 
32-Bit Integer 

The 32-bit integer word is arranged as follows: 



Bit      31     30    29    28    27    26    25   ...  7    6    5    4    3    2    1    0 

Weight  -2^31  2^30  2^29  2^28  2^27  2^26  2^25  ...  2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0 

The 32-bit word is interpreted as a two's-complement integer. 
For integer multiplications, the user has the option of interpreting 
integers as unsigned. An unsigned single-precision integer 
has a format similar to that of the two's-complement integer, 
but with an MSB weight of 2^31. 
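
The two interpretations differ only in the weight given to the most-significant bit. A short illustrative sketch (function names are hypothetical):

```python
def as_unsigned32(word):
    """Interpret a 32-bit word as unsigned: the MSB has weight +2**31."""
    return word & 0xFFFFFFFF


def as_signed32(word):
    """Interpret a 32-bit word as two's complement: the MSB has weight -2**31."""
    word &= 0xFFFFFFFF
    return word - (1 << 32) if word & 0x80000000 else word


print(as_unsigned32(0xFFFFFFFF), as_signed32(0xFFFFFFFF))
print(as_unsigned32(0x80000000), as_signed32(0x80000000))
```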



64-Bit Integer 

The 64-bit integer word is arranged as follows: 

Bit      63     62    61    60    59    58    57   ...  7    6    5    4    3    2    1    0 

Weight  -2^63  2^62  2^61  2^60  2^59  2^58  2^57  ...  2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0 

The 64-bit word is interpreted as a two's-complement integer. 
For integer multiplications, the user has the option of interpreting 
integers as unsigned. An unsigned double-precision integer 
has a format similar to that of the two's-complement 
integer, but with an MSB weight of 2^63. 



IEEE Formats 

IEEE Single-Precision 

The IEEE single-precision word is 32 bits wide and is arranged 
in the format as follows: 



Bit      31     30   29   28  ...  24   23       22    21    20   ...  2      1      0 
Weight   s      2^7  2^6  2^5 ...  2^1  2^0      2^-1  2^-2  2^-3 ...  2^-21  2^-22  2^-23 
Field    sign   biased exponent (e)              fraction (f) 

The floating-point word is divided into three fields: a single-bit 
sign, an 8-bit biased exponent, and a 23-bit fraction. 

The sign bit is 0 for positive numbers and 1 for negative 
numbers. Zero may have either sign. 

The biased exponent is an 8-bit unsigned integer representing 
a multiplicative factor of some power of two. The bias value is 
127. If, for example, the multiplicative value for a floating-point 
number is to be 2^a, the value of the biased exponent is 
a + 127, where "a" is the true exponent. 

The fraction is a 23-bit unsigned fractional field containing the 
23 least-significant bits of the floating-point number's 24-bit 
mantissa. The weight of the fraction's most-significant bit is 
2^-1. The weight of the least-significant bit is 2^-23. 

An IEEE floating-point number is evaluated or interpreted as 
follows: 

If e = 255 and f ≠ 0    value = NaN                        Not-a-Number 
If e = 255 and f = 0    value = (-1)^s ∞                   Infinity 
If 0 < e < 255          value = (-1)^s 2^(e-127) (1.f)     Normalized number 
If e = 0 and f ≠ 0      value = (-1)^s 2^(-126) (0.f)      Denormalized number 
If e = 0 and f = 0      value = (-1)^s 0                   Zero 

Infinity: Infinity can have either a positive or negative sign. 
The interpretation of infinities is determined by the Affine/ 
Projective select input AFF/PROJ. 

NaN: A NaN is interpreted as a signal or symbol. NaNs are 
used to indicate invalid operations, and as a means of passing 
process status through a series of calculations. They arise in 
two ways: either generated by the Am29C327 to indicate an 
invalid operation, or provided by the user as an input. A 
signaling NaN has the MSB of its fraction set to 0 and at least 
one of the remaining fraction bits set to 1. A quiet NaN has the 
MSB of its fraction set to 1. 

The IEEE format is fully described in IEEE Standard 754. 
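
The interpretation rules for the IEEE single-precision word can be sketched as a small decoder. This is an illustration of the field layout and evaluation rules given above, not a description of the device's internal logic; the function names are hypothetical.

```python
import math


def decode_ieee_single(word):
    """Evaluate a 32-bit IEEE single-precision word as a Python float."""
    s = (word >> 31) & 0x1          # sign
    e = (word >> 23) & 0xFF         # biased exponent
    f = word & 0x7FFFFF             # 23-bit fraction
    sign = -1.0 if s else 1.0
    if e == 255:
        return math.nan if f != 0 else sign * math.inf
    if e == 0:
        # Zero or denormalized number: 2**-126 * 0.f == f * 2**-149
        return sign * f * 2.0 ** -149
    # Normalized number: 2**(e - 127) * 1.f
    return sign * (1.0 + f / 2.0 ** 23) * 2.0 ** (e - 127)


def classify_nan(word):
    """Return 'quiet', 'signaling', or None if the word is not a NaN."""
    e = (word >> 23) & 0xFF
    f = word & 0x7FFFFF
    if e != 255 or f == 0:
        return None
    return "quiet" if f & 0x400000 else "signaling"


print(decode_ieee_single(0x3FC00000))   # normalized number
print(decode_ieee_single(0x00000001))   # smallest denormalized number
print(classify_nan(0x7FC00000), classify_nan(0x7F800001))
```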



IEEE Double-Precision 

The IEEE double-precision word is 64 bits wide and is 
arranged in the format shown below: 



Bit      63     62    61   60  ...  54   53   52       51    50    49    48    47   ...  3      2      1      0 
Weight   s      2^10  2^9  2^8 ...  2^2  2^1  2^0      2^-1  2^-2  2^-3  2^-4  2^-5 ...  2^-49  2^-50  2^-51  2^-52 
Field    sign   biased exponent (e)                    fraction (f) 

The floating-point word is divided into three fields: a single-bit 
sign, an 11-bit biased exponent, and a 52-bit fraction. 

The sign bit is 0 for positive numbers and 1 for negative 
numbers; zero may have either sign. 

The biased exponent is an 11-bit unsigned integer representing 
a multiplicative factor of some power of two. The bias 
value is 1023. If, for example, the multiplicative value for a 
floating-point number is to be 2^a, the value of the biased 
exponent is a + 1023, where "a" is the true exponent. 

The fraction is a 52-bit unsigned fractional field containing the 
52 least-significant bits of the floating-point number's 53-bit 
mantissa. The weight of the fraction's most-significant bit is 
2^-1. The weight of the least-significant bit is 2^-52. 

An IEEE floating-point number is evaluated or interpreted as 
follows: 

If e = 2047 and f ≠ 0   value = Reserved operand           Not-a-Number 
If e = 2047 and f = 0   value = (-1)^s ∞                   Infinity 
If 0 < e < 2047         value = (-1)^s 2^(e-1023) (1.f)    Normalized number 
If e = 0 and f ≠ 0      value = (-1)^s 2^(-1022) (0.f)     Denormalized number 
If e = 0 and f = 0      value = (-1)^s 0                   Zero 

Infinity: Infinity can have either a positive or negative sign. 
The interpretation of infinities is determined by the Affine/ 
Projective select input AFF/PROJ. 

NaN: A NaN is interpreted as a signal or symbol. NaNs are 
used to indicate invalid operations, and as a means of passing 
process status through a series of calculations. They arise in 
two ways: either generated by the Am29C327 to indicate an 
invalid operation, or provided by the user as an input. A 
signaling NaN has the MSB of its fraction set to 0 and at least 
one of the remaining fraction bits set to 1. A quiet NaN has the 
MSB of its fraction set to 1. 

The IEEE format is fully described in IEEE Standard 754. 



DEC Formats 

DEC F 

The DEC F word is 32 bits wide and is arranged in the format 
shown below: 

Bit      31     30   29   28  ...  24   23       22    21    20    19    18   ...  3      2      1      0 
Weight   s      2^7  2^6  2^5 ...  2^1  2^0      2^-2  2^-3  2^-4  2^-5  2^-6 ...  2^-21  2^-22  2^-23  2^-24 
Field    sign   biased exponent (e)              fraction (f) 

The floating-point word is divided into three fields: a single-bit 
sign, an 8-bit biased exponent, and a 23-bit fraction. 

The sign bit is 0 for positive numbers and 1 for negative 
numbers; zero has a positive sign. 

The biased exponent is an 8-bit unsigned integer representing 
a multiplicative factor of some power of two. The bias value is 
128. If, for example, the multiplicative value for a floating-point 
number is to be 2^a, the value of the biased exponent is 
a + 128, where "a" is the true exponent. 

The fraction is a 23-bit unsigned fractional field containing the 
23 least-significant bits of the floating-point number's 24-bit 
mantissa. The weight of the fraction's most-significant bit is 
2^-2. The weight of the least-significant bit is 2^-24. 

A DEC F floating-point number is evaluated or interpreted as 
follows: 

If e ≠ 0                value = (-1)^s 2^(e-128) (0.1f) 
If s = 0 and e = 0      value = 0 
If s = 1 and e = 0      value = DEC-Reserved Operand 

DEC-Reserved Operand: A DEC-Reserved Operand is interpreted 
as a signal or symbol. DEC-Reserved Operands are 
used to indicate invalid operations and operations whose 
results have overflowed the destination format. They may also 
be used to pass symbolic information from one calculation to 
another. 
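
The DEC F evaluation rules can be sketched as a decoder operating on the field layout shown above (sign in bit 31, exponent in bits 30-23, fraction in bits 22-0). The function name is hypothetical, and returning None for a DEC-Reserved Operand is a convention chosen for this example only.

```python
def decode_dec_f(word):
    """Evaluate a 32-bit DEC F word laid out as sign | exponent | fraction."""
    s = (word >> 31) & 0x1          # sign
    e = (word >> 23) & 0xFF         # biased exponent (bias 128)
    f = word & 0x7FFFFF             # 23-bit fraction
    if e == 0:
        # s = 0: zero; s = 1: DEC-Reserved Operand (represented here as None)
        return 0.0 if s == 0 else None
    sign = -1.0 if s else 1.0
    # Mantissa 0.1f: hidden bit of weight 2**-1, fraction bits start at 2**-2
    mantissa = 0.5 + f / 2.0 ** 24
    return sign * mantissa * 2.0 ** (e - 128)


print(decode_dec_f(0x40800000))   # e = 129, f = 0
print(decode_dec_f(0x40000000))   # e = 128, f = 0
print(decode_dec_f(0x80000000))   # reserved operand
```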

The DEC formats are fully described in the VAX Architecture 
Manual. 

DEC D 

The DEC D word is 64 bits wide and is arranged in the format 
shown below: 

Bit      63     62   61   60   59   58   57   56   55       54    53    52    51    50   ...  3      2      1      0 
Weight   s      2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0      2^-2  2^-3  2^-4  2^-5  2^-6 ...  2^-53  2^-54  2^-55  2^-56 
Field    sign   biased exponent (e)                         fraction (f) 

The floating-point word is divided into three fields: a single-bit 
sign, an 8-bit biased exponent, and a 55-bit fraction. 

The sign bit is 0 for positive numbers and 1 for negative 
numbers; zero has a positive sign. 

The biased exponent is an 8-bit unsigned integer representing 
a multiplicative factor of some power of two. The bias value is 
128. If, for example, the multiplicative value for a floating-point 
number is to be 2^a, the value of the biased exponent is 
a + 128, where "a" is the true exponent. 

The fraction is a 55-bit unsigned fractional field containing the 
55 least-significant bits of the floating-point number's 56-bit 
mantissa. The weight of the fraction's most-significant bit is 
2^-2. The weight of the least-significant bit is 2^-56. 

A DEC D floating-point number is evaluated or interpreted as 
follows: 

If e ≠ 0                value = (-1)^s 2^(e-128) (0.1f) 
If s = 0 and e = 0      value = 0 
If s = 1 and e = 0      value = DEC-Reserved Operand 

DEC-Reserved Operand: A DEC-Reserved Operand is interpreted 
as a signal or symbol. DEC-Reserved Operands are 
used to indicate invalid operations and operations whose 
results have overflowed the destination format. They may also 
be used to pass symbolic information from one calculation to 
another. 

The DEC formats are fully described in the VAX Architecture 
Manual. 



DEC G 

The DEC G word is 64 bits wide and is arranged in the format 
shown below: 

Bit      63     62    61   60  ...  54   53   52       51    50    49    48    47   ...  3      2      1      0 
Weight   s      2^10  2^9  2^8 ...  2^2  2^1  2^0      2^-2  2^-3  2^-4  2^-5  2^-6 ...  2^-50  2^-51  2^-52  2^-53 
Field    sign   biased exponent (e)                    fraction (f) 

The floating-point word is divided into three fields: a single-bit 
sign, an 11-bit biased exponent, and a 52-bit fraction. 

The sign bit is 0 for positive numbers and 1 for negative 
numbers; zero has a positive sign. 

The biased exponent is an 11-bit unsigned integer representing 
a multiplicative factor of some power of two. The bias 
value is 1024. If, for example, the multiplicative value for a 
floating-point number is to be 2^a, the value of the biased 
exponent is a + 1024, where "a" is the true exponent. 

The fraction is a 52-bit unsigned fractional field containing the 
52 least-significant bits of the floating-point number's 53-bit 
mantissa. The weight of the fraction's most-significant bit is 
2^-2. The weight of the least-significant bit is 2^-53. 

A DEC G floating-point number is evaluated or interpreted as 
follows: 

If e ≠ 0                value = (-1)^s 2^(e-1024) (0.1f) 
If s = 0 and e = 0      value = 0 
If s = 1 and e = 0      value = DEC-Reserved Operand 

DEC-Reserved Operand: A DEC-Reserved Operand is interpreted 
as a signal or symbol. DEC-Reserved Operands are 
used to indicate invalid operations and operations whose 
results have overflowed the destination format. They may also 
be used to pass symbolic information from one calculation to 
another. 

The DEC formats are fully described in the VAX Architecture 
Manual. 

IBM Formats 

IBM Single-Precision 

The IBM single-precision word is 32 bits wide and is arranged 
in the format shown below: 

Bit      31     30   29   28   27   26   25   24       23    22    21    20    19    18   ...  3      2      1      0 
Weight   s      2^6  2^5  2^4  2^3  2^2  2^1  2^0      2^-1  2^-2  2^-3  2^-4  2^-5  2^-6 ...  2^-21  2^-22  2^-23  2^-24 
Field    sign   biased exponent (e)                    fraction (f) 

The floating-point word is divided into three fields: a single-bit 
sign, a 7-bit biased exponent, and a 24-bit fraction. 

The sign bit is 0 for positive numbers and 1 for negative 
numbers; a True-zero has a positive sign. 

The biased exponent is a 7-bit unsigned integer representing a 
multiplicative factor of some power of 16. The bias value is 64. 
If, for example, the multiplicative value for a floating-point 
number is to be 16^a, the value of the biased exponent is 
a + 64, where "a" is the true exponent. 

The fraction is a 24-bit unsigned fractional field containing the 
24 least-significant bits of the floating-point number's 25-bit 
mantissa. The weight of the fraction's most-significant bit is 
2^-1. The weight of the least-significant bit is 2^-24. 

An IBM floating-point number is evaluated or interpreted as 
follows: 

value = (-1)^s 16^(e-64) (0.f) 

Zero: There are two possible classes of representations for 
zero. Since there is no leading bit in the IBM format, the range 
of the IBM fraction is equal to or greater than zero and less 
than one. If an operation causes the fraction of the result to 
cancel exactly, then the result is a floating-point zero. A True- 
zero has a positive sign, a biased exponent of zero, and a 
fraction of zero. 

The IBM format is fully described in the IBM System/370 
Principles of Operation Manual. 
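
The IBM single-precision evaluation rule can be sketched as a decoder. Because the format has no hidden bit, a single formula covers every bit pattern. The function name is hypothetical.

```python
def decode_ibm_single(word):
    """Evaluate a 32-bit IBM single-precision (base-16) floating-point word."""
    s = (word >> 31) & 0x1          # sign
    e = (word >> 24) & 0x7F         # biased exponent (bias 64, base 16)
    f = word & 0xFFFFFF             # 24-bit fraction, 0 <= 0.f < 1
    sign = -1.0 if s else 1.0
    return sign * (f / 2.0 ** 24) * 16.0 ** (e - 64)


print(decode_ibm_single(0x41100000))   # fraction 1/16, exponent 16**1
print(decode_ibm_single(0xC1200000))   # negative, fraction 2/16
print(decode_ibm_single(0x42640000))   # fraction 100/256, exponent 16**2
```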



IBM Double-Precision 

The IBM double-precision word is 64 bits wide and is arranged 
in the format shown below: 

Bit      63     62   61   60   59   58   57   56       55    54    53    52    51    50   ...  3      2      1      0 
Weight   s      2^6  2^5  2^4  2^3  2^2  2^1  2^0      2^-1  2^-2  2^-3  2^-4  2^-5  2^-6 ...  2^-53  2^-54  2^-55  2^-56 
Field    sign   biased exponent (e)                    fraction (f) 

The floating-point word is divided into three fields: a single-bit 
sign, a 7-bit biased exponent, and a 56-bit fraction. 

The sign bit is 0 for positive numbers and 1 for negative 
numbers; a True-zero has a positive sign. 

The biased exponent is a 7-bit unsigned integer representing a 
multiplicative factor of some power of 16. The bias value is 64. 
If, for example, the multiplicative value for a floating-point 
number is to be 16^a, the value of the biased exponent is 
a + 64, where "a" is the true exponent. 

The fraction is a 56-bit unsigned fractional field containing the 
56 least-significant bits of the floating-point number's 57-bit 
mantissa. The weight of the fraction's most-significant bit is 
2^-1. The weight of the least-significant bit is 2^-56. 

An IBM floating-point number is evaluated or interpreted as 
follows: 

value = (-1)^s 16^(e-64) (0.f) 

Zero: There are two possible classes of representations for 
zero. Since there is no leading bit in the IBM format, the range 
of the IBM fraction is equal to or greater than zero and less 
than one. If an operation causes the fraction of the result to 
cancel exactly, then the result is a floating-point zero. A True- 
zero has a positive sign, a biased exponent of zero, and a 
fraction of zero. 

The IBM format is fully described in the IBM System/370 
Principles of Operation Manual. 

APPENDIX B — ROUNDING MODES 

The Am29C327 provides six rounding modes for floating-point 
operations and for integer multiplication: 

RM2  RM1  RM0  Round Mode 
 0    0    0   Round to Nearest (IEEE) 
 0    0    1   Round to Minus Infinity 
 0    1    0   Round to Plus Infinity 
 0    1    1   Round to Zero 
 1    0    0   Round to Nearest (DEC) 
 1    0    1   Round Away From Zero 
 1    1    X   Illegal Value 



Round to Nearest IEEE (Unbiased) 

The infinitely precise result of an operation is rounded to the 
closest representable value in the destination format. If the 
infinitely precise result is exactly halfway between two representations, 
it is rounded to the representation having a least- 
significant bit of zero. This rounding mode conforms to the 
"round to nearest" mode described in the IEEE Floating-Point 
Standard. 

Round to Minus Infinity 

The infinitely precise result of an operation is rounded to the 
closest representable value in the destination format that is 
less than or equal to the infinitely precise result. This rounding 
mode conforms to the "round to minus infinity" mode de- 
scribed in the IEEE Floating-Point Standard. 



Round to Plus Infinity 

The infinitely precise result of an operation is rounded to the 
closest representable value in the destination format that is 
greater than or equal to the infinitely precise result. This rounding 
mode conforms to the "round to plus infinity" mode described 
in the IEEE Floating-Point Standard. 

Round to Zero 

The infinitely precise result of an operation is rounded to the 
closest representable value in the destination format whose 
magnitude is less than or equal to the infinitely precise result. 
This rounding mode conforms to the "round to zero" mode 
described in the IEEE Floating-Point Standard. 

Round to Nearest DEC (Biased) 

The infinitely precise result of an operation is rounded to the 
closest representable value in the destination format. If the 
infinitely precise result is exactly halfway between two repre- 
sentations, it is rounded to the representation having the 
greater magnitude. This rounding mode is used by DEC VAX 
computers. 

Round Away from Zero 

The infinitely precise result of an operation is rounded to the 
closest representable value in the destination format whose 
magnitude is greater than or equal to the infinitely precise 
result. 
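
The six definitions above can be sketched in a few lines of Python. This is only an illustration: it assumes the "representable values" are the integers rather than the significand grid of a real floating-point format, and the mode names and the function `round_to_integer` are hypothetical, not part of the Am29C327.

```python
from fractions import Fraction
import math

MODES = ("nearest_ieee", "minus_inf", "plus_inf", "zero", "nearest_dec", "away")

def round_to_integer(x, mode):
    """Round the exact value x to an integer under the named rounding mode."""
    if mode not in MODES:
        raise ValueError(f"unknown rounding mode: {mode}")
    lo, hi = math.floor(x), math.ceil(x)   # the two neighbouring grid points
    if lo == hi:                           # x is already representable
        return lo
    if mode == "minus_inf":                # result <= exact value
        return lo
    if mode == "plus_inf":                 # result >= exact value
        return hi
    if mode == "zero":                     # magnitude <= exact magnitude
        return lo if x > 0 else hi
    if mode == "away":                     # magnitude >= exact magnitude
        return hi if x > 0 else lo
    # The two "nearest" modes differ only when x is exactly halfway.
    if x - lo != hi - x:
        return lo if x - lo < hi - x else hi
    if mode == "nearest_ieee":             # tie: pick the even neighbour (LSB = 0)
        return lo if lo % 2 == 0 else hi
    return hi if x > 0 else lo             # nearest_dec tie: greater magnitude

half = Fraction(5, 2)
print({m: round_to_integer(half, m) for m in MODES})
```

Using `Fraction` keeps the "infinitely precise result" exact, so the halfway test is not disturbed by binary floating-point error.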

A graphical representation of these rounding modes is shown 
in Figures B1-1 and B1-2. 
Figure B1-1. Graphical Interpretation of IEEE Round-to-Nearest, Round-to-Minus-Infinity, and Round-to-Plus-Infinity Rounding Modes

Figure B1-2. Graphical Interpretation of Round-to-Zero, DEC Round-to-Nearest, and Round-Away-from-Zero Rounding Modes



APPENDIX C — ADDITIONAL OPERATION 
DETAILS 

Differences Between IEEE Floating-Point 
Standard and Am29C327 IEEE Operation 

The IEEE floating-point standard recommends that a trapped
overflow on conversion from a binary format return a result in
that or a wider format, rounded to the destination format. The
Am29C327 returns an operand in the destination format,
rounded to that format. Note that trapped operation is an
optional aspect of the IEEE floating-point standard, and as
such, is not necessary for compliance.



Differences Between IBM 370 Floating-Point 
Arithmetic and Am29C327 IBM Operation 

For all arithmetic operations, the Am29C327 in general will 
produce a more precise result than the IBM 370. 



Differences Between DEC Floating-Point 
Arithmetic and Am29C327 DEC Operation 

The Am29C327 and DEC VAX floating-point formats contain 
identical information, but the sub-fields of the floating-point 
words are arranged differently: 



The Am29C327 DEC F format is:
     sign     - bit 31
     exponent - bits 30-23
     mantissa - bits 22-0

The Am29C327 DEC D format is:
     sign     - bit 63
     exponent - bits 62-55
     mantissa - bits 54-0

The VAX F format is:
     sign     - bit 15
     exponent - bits 14-7
     mantissa - bits 6-0,
                bits 31-16

The VAX D format is:
     sign     - bit 15
     exponent - bits 14-7
     mantissa - bits 6-0,
                bits 31-16,
                bits 47-32,
                bits 63-48
                bit 6 = MSB,
                bit 48 = LSB
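
Read side by side, the two layouts differ only in the order of their 16-bit words, so a conversion can be sketched as a word reversal. The following is a minimal illustration under that reading of the bit assignments above; the function names are hypothetical and are not part of any AMD or DEC software.

```python
def reverse_words16(value, n_words):
    """Reverse the order of the n_words 16-bit words that make up value."""
    out = 0
    for i in range(n_words):
        word = (value >> (16 * i)) & 0xFFFF
        out |= word << (16 * (n_words - 1 - i))
    return out

def vax_f_to_am29c327(word32):
    # VAX F: sign/exponent/high mantissa sit in bits 15-0; move them to bits 31-16.
    return reverse_words16(word32, 2)

def vax_d_to_am29c327(word64):
    # VAX D: bits 15-0 become bits 63-48, and bits 63-48 become bits 15-0.
    return reverse_words16(word64, 4)

# The same reversal converts back, since reversing twice restores the input.
print(hex(vax_f_to_am29c327(0x00004080)))
```

The F-format example moves the sign and exponent word `0x4080` from the low half to the high half, which is the arrangement the Am29C327 DEC F format expects.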
ABSOLUTE MAXIMUM RATINGS

Storage Temperature ........................... -65 to +150°C
Ambient Temperature Under Bias ................ -55 to +125°C
Supply Voltage to Ground Potential
  Continuous .................................. -0.5 to +7.0 V
DC Voltage Applied to Outputs
  for HIGH State .............................. -0.5 V to +VCC Max.
DC Input Voltage .............................. -0.5 to +5.5 V
DC Output Current, Into Outputs ............... 30 mA
DC Input Current .............................. -10 to +10 mA

Stresses above those listed under ABSOLUTE MAXIMUM
RATINGS may cause permanent device failure. Functionality
at or above these limits is not implied. Exposure to absolute
maximum ratings for extended periods may affect device
reliability.

OPERATING RANGES

Commercial (C) Devices
  Temperature (TA) ............................ 0 to +70°C
  Supply Voltage (VCC) ........................ +4.75 V Min.
Military (M) Devices
  Temperature (TA) ............................ -55 to +125°C
  Supply Voltage (VCC) ........................ +4.5 V to +5.5 V

Operating ranges define those limits between which the
functionality of the device is guaranteed.

DC CHARACTERISTICS over operating range unless otherwise specified


Parameter   Parameter                Test Conditions
Symbol      Description              (Note 1)                        Min.   Max.   Unit

VOH         Output HIGH Voltage      VCC = Min.                      2.4           V
                                     VIN = VIL or VIH
                                     VIL = 0.8 V, VIH = 2.0 V
                                     IOH = -0.4 mA
VOL         Output LOW Voltage       VCC = Min.                             0.5    V
                                     VIN = VIL or VIH
                                     VIL = 0.8 V, VIH = 2.0 V
                                     IOL = 4.0 mA
VIH         Input HIGH Level         Guaranteed Input Logical-HIGH   2.0           V
                                     Voltage for All Inputs
VIL         Input LOW Level          Guaranteed Input Logical-LOW           0.8    V
                                     Voltage for All Inputs
VI          Input Clamp Voltage      VCC = Min., IIN = -18 mA               -1.5   V
IIL         Input LOW Current        VCC = Max., VIN = 0.4 V                -0.4   mA
IIH         Input HIGH Current       VCC = Max., VIN = 2.4 V                75     µA
II          Input HIGH Current       VCC = Max., VIN = 5.5 V                1      mA
IOZH        Off-State (High-         VCC = Max., VO = 2.4 V                 25     µA
IOZL        Impedance) Output        VCC = Max., VO = 0.4 V                 -25    µA
            Current
ISC         Output Short-Circuit     VCC = Max., VO = 0 V            -3     -30    mA
(Note 2)    Current                  All Outputs
ICC         Power Supply Current     COM'L                                  300    mA
(Note 3)                             MIL                                    350    mA
ICCQ1       Quiescent Power Supply   COM'L                                         mA
(Note 4)    Current                  MIL                                           mA
ICCQ2       Quiescent Power Supply   COM'L                                         mA
(Note 5)    Current                  MIL                                           mA


Notes: 1. For conditions shown as Min. or Max., use the appropriate value specified under Electrical Characteristics for the applicable device type.

2. Not more than one output should be shorted at a time. Duration of the short-circuit test should not exceed one second.

3. ICC is measured with clock frequency = 8 MHz and with outputs disabled. Inputs should be presented with random logic-HIGHs and
LOWs to assure the toggling of internal nodes.

4. VIN > VIH, VIN < VIL

5. VIN > VCC - 0.2 V, VIN < 0.2 V
SWITCHING CHARACTERISTICS over operating range unless otherwise specified

No.  Parameter Description                    Test Conditions   Min.   Max.   Unit

1    CLK Period                               (Note 1)
       Flow-Through Mode
         Multiply-Accumulate                                    360    DC     ns
         All Other Operations                                   240    DC     ns
       Single-Pipelined Mode
         Multiply-Accumulate                                    240    DC     ns
         All Other Operations                                   120    DC     ns
       Double-Pipelined Mode
         Multiply-Accumulate                                    120    DC     ns
2    CLK LOW Time                                                             ns
3    CLK HIGH Time                                                            ns
4    CLK Rise Time                            (Note 2)                        ns
5    CLK Fall Time                            (Note 2)                        ns
6    Data/Instruction Setup Time              (Note 3)          15            ns
7    Data/Instruction Hold Time               (Note 3)                        ns
8    Control Lines Setup Time                 (Note 4)          15            ns
9    Control Lines Hold Time                  (Note 4)                        ns
10   F0-31 CLK-to-Output-Valid                                         20     ns
       F Register Clocked
11   FLAG1-6, SIGN CLK-to-Output-Valid                                 20     ns
       S Register Clocked
12   F0-31 CLK-to-Output-Valid
       F Register Transparent
       Flow-Through Mode
         Multiply-Accumulate                                           380    ns
         All Other Operations                                          260    ns
       Single-Pipelined Mode
         Multiply-Accumulate                                           260    ns
         All Other Operations                                          140    ns
       Double-Pipelined Mode
         Multiply-Accumulate                                           140    ns
13   FLAG1-6, SIGN CLK-to-Output-Valid
       S Register Transparent
       Flow-Through Mode
         Multiply-Accumulate                                           380    ns
         All Other Operations                                          260    ns
       Single-Pipelined Mode
         Multiply-Accumulate                                           260    ns
         All Other Operations                                          140    ns
       Double-Pipelined Mode
         Multiply-Accumulate                                           140    ns
14   OEF, OES Disable Time, HIGH to Z                                  15     ns
15   OEF, OES Disable Time, LOW to Z                                   15     ns
16   OEF, OES Enable Time, Z to HIGH                                   20     ns
17   OEF, OES Enable Time, Z to LOW                                    20     ns
18   FSEL to F0-31                                                     20     ns
19   MSERR Data-to-Valid Delay                                         20     ns


Notes: 1. CLK switching characteristics are made relative to 2.5 V.

2. CLK rise time and fall time measured between 0.8 V and (VCC - 1.0 V).

3. Data/Instruction signals include R0-31, S0-31, S/DR, S/DS, S/DF, RM0-2, PSEL0-3, QSEL0-3, TSEL0-3 and I0-13.

4. Control signals include ENR, ENS, ENF, ENRF, RFSEL0-2, FSEL, ENI, OEF, and OES.

Conditions: A. All inputs/outputs except CLK are TTL-compatible for VIH, VIL, and VOL.

B. All outputs are driving 80 pF unless otherwise noted.

C. All setup, hold, and delay times are measured relative to CLK at VCC/2 volts unless otherwise noted.
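
As a worked reading of parameter 1, the minimum clock period for each mode translates directly into a maximum clock rate. The sketch below uses only the CLK Period minimums from the table; the dictionary and function names are illustrative assumptions.

```python
# Minimum CLK period in nanoseconds, keyed by (pipeline mode, operation class),
# copied from parameter 1 of the switching characteristics.
CLK_PERIOD_MIN_NS = {
    ("flow-through", "multiply-accumulate"): 360,
    ("flow-through", "other"): 240,
    ("single-pipelined", "multiply-accumulate"): 240,
    ("single-pipelined", "other"): 120,
    ("double-pipelined", "multiply-accumulate"): 120,
}

def max_clock_mhz(mode, operation):
    """Return the highest clock rate (MHz) allowed by the minimum period."""
    period_ns = CLK_PERIOD_MIN_NS[(mode, operation)]
    return 1000.0 / period_ns   # 1000 ns per microsecond -> MHz

for key in CLK_PERIOD_MIN_NS:
    print(key, round(max_clock_mhz(*key), 2), "MHz")
```

Because the maximum period is listed as DC, there is no lower bound on the clock rate; only the upper bound computed here applies.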
SWITCHING TEST CIRCUITS

A. Three-State Outputs (R1 = 300 Ω, R2 = 3 kΩ)

B. Normal Outputs (R1 = 300 Ω)

Notes: 1. CL = 50 pF includes scope probe, wiring, and stray capacitances without device in test fixture.

2. S1, S2, S3 are closed during function tests and all AC tests except output enable tests.

3. S1 and S3 are closed while S2 is open for tPZH test.

4. CL = 5.0 pF for output disable tests.
SWITCHING TEST WAVEFORMS

Setup, Hold, and Release Times

Notes: 1. Diagram shown for HIGH data only. Output
transition may be opposite sense.
2. Cross-hatched area is don't care condition.

Pulse Width

Propagation Delay

Enable and Disable Times

Notes: 1. Diagram shown for Input Control Enable-LOW
and Input Control Disable-HIGH.
2. S1, S2 and S3 of Load Circuit are closed except
where shown.
SWITCHING WAVEFORMS
KEY TO SWITCHING WAVEFORMS

DON'T CARE; ANY CHANGE PERMITTED
WILL BE CHANGING FROM H TO L
WILL BE CHANGING FROM L TO H
CHANGING; STATE UNKNOWN
CENTER LINE IS HIGH IMPEDANCE "OFF" STATE

Input Clock Timing
Timing of Operations with F Register and Status Clocked. Assumes 32-Bit Bus, Single-Cycle,
LSW-First Input Mode and Flow-Through Operation

Timing of Operations with F-Register and Status Register in Feedthrough Mode. Assumes 32-
Bit Bus, Single-Cycle, LSW-First Input Mode and Flow-Through Operation.



Register File Control Timing (signals: CLK, ENRF, RFSEL0-2)

Enable/Disable Timing for F0-31 (signals: OEF, F0-31)
Enable/Disable Timing for FLAG1-6 and SIGN (signals: OES, FLAGS, SIGN)

Output Selection Timing (signals: FSEL, F0-31)

Master/Slave Timing (Assumes SLAVE Mode) (signals: F0-31, FLAGS, SIGN, MSERR)
Advanced Micro Devices does not support, maintain, nor guarantee the performance of third-party products described in this chapter.



CHAPTER 5 

Support Tools 






Advanced Micro Devices is recognized as the pioneer
and leading supplier of fast microprogrammable bit-slice
and related integrated circuits used in a wide variety of
high-performance systems.

Because of their flexibility, these microprogrammable
ICs require a deeper understanding of hardware than
required by a typical MOS microprocessor. But there is
no reason to shy away from microprogramming: it is not
difficult, and there are several hardware and software
tools available.

Tools that help the systems engineer design his system 
can be in the form of hardware, software, written materi- 
als, and even professional advice. The importance of 
support to any design approach, and the relative difficulty 
of microcoded design, require a detailed explanation. 

As more support is provided to the customer, ease-of-
design improves and time-to-market decreases. The
design process becomes less tedious, risk is reduced,
and a lower skill level is required of the designer to
implement a successful system. In general, the more
rigid a device family becomes (i.e., fixed architecture/
fixed instruction set), the easier it is to support.

When assessing the support available for a design
approach, consideration needs to be given to the realities
of the situation. For instance, building blocks offer a
flexibility in architecture and programming that can only
be equaled in gate arrays (which can be even more
versatile). The informed engineer would not ask the
question, "Can I get compiler support for what I build with
gate arrays?" The answer would obviously be, "Only if
you emulated something that was already supported, or
targeted a compiler to your new creation." Until tools
become available that automatically generate compilers,
it will remain the case that more flexible approaches get
you closer to the hardware and away from higher level
language, and usually result in better performance.

It is impossible to even imagine all of the various ways a
microcoded system might be constructed. Further, since
the architecture is not fixed, it is not possible to pre-define
a compiler or assembler for the system. If the full flexibility
of the microprogrammed-building-block approach is to
be maintained, then a penalty must be paid in terms of
a lack of high-level language support. Fortunately, a
good meta assembler greatly alleviates the programming
task. Of course, once a system is defined, a
compiler may be developed, but not cheaply. With these
tradeoffs now in mind, we can present tools available to
the Am29300/29C300 family.



5.1 Am29C300 EVALUATION BOARD 

The Am29C300 Evaluation Board is an educational tool 
to help the user understand the Am29C300 32-bit build- 
ing-block family. With all the major devices of the 
Am29C300 family and an on-board debug monitor, the 
board provides an excellent tool for those who would like 
to learn more about the Am29C300 family. A block 
diagram of the board is shown in Figure 5-1.

The board consists of two systems: the 80188 and
Am29C300 system. The 80188 system is a front-end
processor which provides the necessary interface between
the board and external sources, such as a CRT
terminal. Through a parallel interface between the 80188
system and the Am29C300 system, the 80188 system
can control and monitor the activity of the Am29C300
system, which is a 32-bit system with three major parts:
a computer control unit, an execution unit, and memory.

Am29C300 System 

As in a standard computer architecture, the computer
control unit provides all the control signals for the
Am29C300 system. It includes several major hardware
blocks: sequencer (Am29C331), writable control store,
pipeline register, interrupt controller, and macro-instruction
register. Its operation follows a standard procedure.
First, it fetches and stores a macro instruction into the
macro-instruction register; then, the opcode of the macro
instruction is decoded to find the correct micro-routine for
the macro instruction. Finally, the selected micro-routine
controls the operations of the execution unit and the
memory.
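
The fetch, decode, and dispatch sequence just described can be sketched as a table lookup. Everything specific in this example is an assumption made for illustration: the opcode position (top 8 bits of a 32-bit macro instruction), the table contents, and the micro-routine addresses are hypothetical and are not taken from the evaluation board.

```python
# Hypothetical mapping table: opcode -> start address of its micro-routine.
MICROROUTINE_MAP = {
    0x01: 0x010,   # e.g. an "add" macro instruction
    0x02: 0x020,   # e.g. a "multiply" macro instruction
}
FETCH_ROUTINE = 0x000   # assumed address of the instruction-fetch micro-routine

def opcode_of(macro_instruction):
    """Extract the opcode, assumed here to be the top 8 bits of 32."""
    return (macro_instruction >> 24) & 0xFF

def dispatch(macro_instruction):
    """Return the micro-routine start address for a macro instruction.

    Unknown opcodes fall back to the fetch routine in this sketch.
    """
    return MICROROUTINE_MAP.get(opcode_of(macro_instruction), FETCH_ROUTINE)

print(hex(dispatch(0x02000000)))
```

In hardware the lookup would be done by a mapping memory or PAL feeding the sequencer; the dictionary stands in for that decode step.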

With the building blocks of the Am29C300 family, a
powerful execution unit has been implemented on the
board. The execution unit is able to handle 32-bit arithmetic
and logic operations, multi-precision multiplication
and division, and single-precision floating-point calculations
within a reasonable time period. Also, the execution
unit has 64 32-bit registers in which to store data. The
following Am29C300 building blocks have been included
in the execution unit:

• Am29C334 - 64 x 18 Bit Dual-Access Four-Port
Register File

• Am29C332 - 32-Bit Arithmetic Logic Unit

• Am29C323 - 32-Bit Parallel Multiplier

• Am29C325 - Single-Precision Floating-Point
Processor
Figure 5-1. Am29C300 Evaluation Board

Figure 5-2. Am29C300 EVB Microcode BIT Map

The memory architecture is very straightforward. It includes
12 static RAMs and a control PAL. Three bits of
the microcode are decoded by the control PAL to generate
chip selects and write pulses for the RAMs. A register
in the execution unit should act as a program counter to
provide addresses for the RAMs.

Microcode 

The 96-bit wide microcode is divided into five major
fields: sequencer and interrupt field, register A field,
register B field, execution field, and control field. A
detailed microcode format is shown in Table 5-1.
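
The five-field split can be sketched as a pack and unpack routine. The field widths (32, 12, 12, 23, and 17 bits) are those given in Table 5-1; placing the sequencer field in the most significant bits is an assumption made for this illustration, and the field and function names are hypothetical.

```python
# (name, width in bits), listed from the assumed most significant field down.
FIELDS = [
    ("sequencer_interrupt", 32),
    ("register_a", 12),
    ("register_b", 12),
    ("execution", 23),
    ("control", 17),
]

def pack_microword(**values):
    """Pack named field values into one 96-bit integer; omitted fields are 0."""
    word = 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word

def unpack_microword(word):
    """Split a 96-bit integer back into its named fields."""
    fields = {}
    for name, width in reversed(FIELDS):
        fields[name] = word & ((1 << width) - 1)
        word >>= width
    return fields

word = pack_microword(register_a=0x123, control=0x1FFFF)
print(f"{word:024X}", unpack_microword(word))
```

The widths sum to 96, so the packed value always fits the microcode word; a real assembler would also apply per-field defaults.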

Monitor 

The monitor of the Am29C300 evaluation board is implemented
in C and controlled by the 80188 system. It
provides a limited microcode assembler and disassembler,
a download and upload utility, and a microcode
debugger. The debugger includes various useful features
such as single step, break point, and display of
register contents.

5.2 Am29300 TEST BOARD 

With the increasing complexity of integrated circuits, it is
often necessary to check the functionality of an IC. The
IBM PC board allows the user to functionally check any
Am29300 family device by writing input test vectors. The
software accompanying the board takes these input
vectors one at a time, applies them to the device under
test, clocks the device, and produces output vectors.
Figure 5-3 shows the architecture of the board. As stated
above, the intention is to allow users to familiarize themselves
with the functionality of the part. AC specs cannot
be verified. Sample input and output files for the
Am29331 are also shown.
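
The apply, clock, and read cycle described above can be sketched as a simple loop. This is not the software shipped with the board: the `apply`, `clock`, and `read_outputs` methods and the `CounterModel` stand-in device are hypothetical, chosen only to make the loop runnable.

```python
def run_vectors(device, input_vectors):
    """Apply each input vector, clock the device once, and record its outputs."""
    output_vectors = []
    for vector in input_vectors:
        device.apply(vector)          # drive the input pins
        device.clock()                # one clock pulse per vector
        output_vectors.append(device.read_outputs())
    return output_vectors

class CounterModel:
    """Hypothetical stand-in for a device under test: counts enabled clocks."""
    def __init__(self):
        self.count = 0
        self.enable = 0

    def apply(self, vector):
        self.enable = vector["enable"]

    def clock(self):
        self.count += self.enable

    def read_outputs(self):
        return {"count": self.count}

vectors = [{"enable": 1}, {"enable": 0}, {"enable": 1}]
print(run_vectors(CounterModel(), vectors))
```

Because each vector is applied and clocked at host speed, a loop like this checks function only, which matches the statement that AC specs cannot be verified.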



Table 5-1

32 Bits          12 Bits           12 Bits           23 Bits     17 Bits

Sequencer        Register A        Register B        Execution   Control
& Interrupt      (Source)          (Source &
Controller                         Destination)

Am29C331         Am29C334 A Port   Am29C334 B Port   Am29C332    Am2925
Am29114                                              Am29C323
                                                     Am29C325


Figure 5-3. Am29300 Testboard - Block Diagram 
Am29331 Input File

socket 120 

63,76,120,96,95,83,82 

107,93,79,80,81,67 

69,94,68,62 

61,55,39,57,56,43,44,45,37,38,25,26 

65,60,48,28,64,53,40,27,58,52,41,14,59,47,42,1 

108,85,86,100,114,90,104,105,24,10,8,20,19,30,29,2

109,98,99,88,115,103,117,106,12,23,22,33,6,18,4,15 

46,77 

97,111,113,101,102,91,92,119,13,36,35,21,7,31,17,16 

84,72,78,71 

73,74,89,34,66,75,11,3,118,110 

32,87,54,49,51,50,5,116,9,112,70; 

M M M M 
3 2 10 
T . . . . D 
SI I SI 33331 

/ HLNI5 31 5 

R OATN- 3210- 

CSFLVETI ST ....D 

PTCDENRO 00 00000 

specify base for each column 

S BBBBBBBQHH HHH H H H H HHHH HHHH B B HHHH B B B B QHH OHH 

Specify pin direction for each column 

% IIIIIIIIIIIIIIIII IIII IIII I I OOOO OOO OOO
:RESET 

001 wOXOXXXXXX XXX X X X X XXXX XXXX X 0000 000 000 - 001 A 

CONTINUE, BRCC_D, CONTINUE 

002 W100010 30X XXX XXXX XXXX XXXX 0000 0000 000 000 - 001 A 

003 wlOOOlOOOO 001 XXXX 8971 XXXX 0000 0000 000 000 - 001 A 

004 W100010 30X XXX XXXX XXXX XXXX 0000 0000 OOO 000 - 004 A 

005 W100010 30X XXX XXXX XXXX XXXX 0000 0000 000 000 - 003 L 



A 


Y 


A 




1 


1 


/ - E E 




5 


/ 5 


I F R Q 




- 


C - 


N U R U V 


G 


A 


I E Y 


T L A C 


N 





N D 


A L R L C 


D 
Am29331 Output File

socket 120 

63,76,120,96,95,83,82 

107,93,79,80,81,67 

69,94,68,62 

61,55,39,57,56,43,44,45,37,38,25,26 

65,60,48,28,64,53,40,27,58,52,41,14,59,47,42,1 

108,85,86,100,114,90,104,105,24,10,8,20,19,30,29,2

109,98,99,88,115,103,117,106,12,23,22,33,6,18,4,15 

46,77 

97,111,113,101,102,91,92,119,13,36,35,21,7,31,17,16 

84,72,78,71 

73,74,89,34,66,75,11,3,118,110 

32,87,54,49,51,50,5,116,9,112,70; 







M M M M 














3 2 10 












T 


. . . . D 


A 


Y 


A 




SI I 


S 1 


3 3 3 3 1 


1 


1 


/ - E E 




/ H L N I 5 


3 1 


5 


5 


/ 5 


I F R Q 




R A T N - 


- - 


3 2 10- 


- 


C - 


N U R U V 


G 


CSFLVETI 


S T 


. . . . D 


A 


I E Y 


T L A C 


N 


PTCDENRO 











N D 


A L R L C 


D 



specify base for each column 

& BBBBBBBQHH HHH H H H H HHHH HHHH B B HHHH B B B B QHH OHH 

specify pin direction for each column 

% IIIIIIIIIIIIIIIII IIII IIII I I OOOO OOO OOO

:RESET 

001 w 00 000 0000 0000 0000 10 3FF 000 - 001 



:CONTINUE, BRCC_D, CONTINUE

002 w 1 1 30 000 0000 0000 0001 3FF 000 - 001
003 w 1 1 00 001 8971 0000 8971 3FF 000 - 001
004 w 1 1 30 000 0000 0000 8972 3FF 000 - 001
004 w 1 1 30 000 0000 0000 8973 3FF 000 - 002
004 w 1 1 30 000 0000 0000 8974 3FF 000 - 003
004 w 1 1 30 000 0000 0000 8975 3FF 000 - 004
005 w 1 1 30 000 0000 0000 8978 3FF 000 - 003
5.3 Am29300 DEFINITION FILE
Introduction

The definition file contains the description of the micro
machine for which assemblies are to be performed. Its
innate flexibility allows the assembler to be retargeted to
support any given bit-slice microprocessor machine and
instruction format. The definition file is composed of:

• Instruction Definition 

• Macro Definitions 

The definition file is stored on a floppy disk and can be 
requested from your local AMD sales office. 

Instruction Definition 

The instruction definition defines a name for the instruction,
the length of the instruction, the fields of the instruction
and variation in format, allowable values for each
field, and default values for each field.

The instruction definition contains: 

• Field Definitions 

• Case Definitions 

Field Definition 

A field in a microinstruction is a group of bits that are 
logically related and are manipulated as a unit. The form 
of the field definition is: 

<fielddef 1> <descript 1>
             <descript 2> (<const 1> : <id 1>,
                           <const 2> : <id 2>,
                             .
                             .
                           <const m> : <id m>)

<fielddef i> is a name of a field definition to be defined.
<const i> is an integer-valued expression of an identifier.
<id i> defines a name of an identifier. A descriptor
<descript> specifies the size and location of the field and
assigns valid values for the field. Valid descriptors are as
follows:

Bits:        Bits that make up a field
Length:      Length of a field
Default:     Default values for a field
Values:      Definitions of names for field values
Invert:      One's complement field values
Complement:  Two's complement field values
Mask:        Use low bits of value, ignore high-order bits
Reverse:     Reverse order of bits in field
Valid:       A list of valid values for the field
Display:     Display mode for debugging

The following is an example of the field definition for the
Am29332:

Am29332: length (7)
         values (H'00' : ZERO-EXTA,
                 H'01' : ZERO-EXTB,
                   .
                   .
                 H'5F' : SMULFIRST)

The name of the field may be any sequence of characters.
Constants may be specified in hexadecimal, decimal,
octal, binary, or ASCII characters. Each of the
Values' definitions consists of a constant followed by a
colon and a symbol that will represent the constant's
value when assigned to the field.
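
The effect of the Invert, Complement, Mask, and Reverse descriptors on a field value can be sketched as follows. This is an illustration of the one-line descriptions in the list above, not the assembler's actual implementation; the function name and the order in which the descriptors are applied are assumptions.

```python
def apply_descriptors(value, length, invert=False, complement=False,
                      mask=False, reverse=False):
    """Return the bit pattern stored in a field of the given length."""
    limit = 1 << length
    if mask:
        value &= limit - 1            # Mask: keep low bits, ignore high-order bits
    elif not 0 <= value < limit:
        raise ValueError(f"value {value} does not fit in {length} bits")
    if invert:
        value = ~value & (limit - 1)  # Invert: one's complement within the field
    if complement:
        value = -value & (limit - 1)  # Complement: two's complement within the field
    if reverse:
        bits = format(value, f"0{length}b")
        value = int(bits[::-1], 2)    # Reverse: mirror the bit order
    return value

# SMULFIRST (H'5F') stored unchanged in the 7-bit Am29332 field.
print(hex(apply_descriptors(0x5F, 7)))
```

Without `mask`, an oversized value is rejected rather than silently truncated, which is the distinction the Mask descriptor draws.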

Case Definitions 

The case definition is used to describe multiple formats
for the microinstruction word. A microinstruction may
have different interpretations of certain fields, depending
upon other fields. The case definition provides a way of
making this form of differentiation formal. The specification
is such that if the selector field has a specific value,
only one of the alternate field definitions is valid and all
the others are undefined.

The case statement is introduced by 'case' and followed
by an optional field selector field name. Following this are
one or more case entries. A case entry consists of a value
or list of values of the selector field and a 'begin-end'
block containing the description of the fields that are
defined for this value.

The form of a case definition is as follows:

Case {<selector>} of
   <casevalue1> : begin
                     <fielddescrs>
                  end;
   <casevalue2> : begin
                     <fielddescrs>
                  end;
endcase;

<selector> is an optional field that is set depending upon
which case branch is selected. <casevalue1> is a value
of the selector that selects the branch and is used
for verification. <fielddescrs> is a field definition. An
example is:

sel : length (1);
case sel of
   0 : begin
          addr  : length (8);
          cntrl : length (8);
       end;
   1 : begin
          data  : length (8);
       end;
endcase;

This structure corresponds to the following overlaid
microinstruction:

65432109 87654321 0   (bit position)
 cntrl     addr   sel
           data   sel
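
The overlay can be sketched as an encoder that accepts only the fields defined for the chosen selector value. The bit positions used here (sel in bit 0, addr or data in bits 8-1, cntrl in bits 16-9) follow the diagram above as read in this sketch and should be treated as an assumption; `encode` and `LAYOUTS` are hypothetical names.

```python
# Fields defined for each selector value, listed from the low-order end.
LAYOUTS = {
    0: [("addr", 8), ("cntrl", 8)],
    1: [("data", 8)],
}

def encode(sel, **fields):
    """Encode a microinstruction; fields not defined for sel are rejected."""
    if sel not in LAYOUTS:
        raise ValueError(f"no case entry for sel={sel}")
    allowed = {name for name, _ in LAYOUTS[sel]}
    undefined = set(fields) - allowed
    if undefined:
        raise ValueError(f"undefined for sel={sel}: {sorted(undefined)}")
    word, shift = sel, 1                  # selector occupies bit 0
    for name, width in LAYOUTS[sel]:
        value = fields.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word |= value << shift
        shift += width
    return word

print(hex(encode(0, addr=0xFF, cntrl=0x01)), hex(encode(1, data=0xA5)))
```

Rejecting `addr` when `sel` is 1 mirrors the rule that only one of the alternate field definitions is valid for a given selector value.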



Macrodefinitions 

Macrodefinition is a very simple language, consisting of
the field assignment. It is based upon the instruction
definitions discussed above and is user-definable, depending
upon any particular architecture.

All instructions are a sequence of phrases, each of which
is either a field assignment or a macro call. The following
is the form of macrodefinitions:

macro <op> &<var 1> &<var 2>;
begin
   <fielddef 1>=<id k>, . . . , <fielddef i>=&<var j>
endm;

<op> is a name of the macro. &<var j> is a macro variable
that may be local to a particular macro or accessible by
any other macro that defines the same global macro
name. The following is an example for the Am29331:

macro call &dest;
begin
   data=&dest, Am29331=CALL
endm;

In this case, the Am29331 is set for a subroutine call
instruction call and the microprogram branches to the
address specified by &dest. Other conditions are default
as given by the Am29331 instruction definition.
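
Macro expansion of this kind amounts to substituting the call's arguments for the macro variables in a list of field assignments. The sketch below models only that substitution for the `call` example; the `MACROS` table and the `expand` function are hypothetical and do not reproduce the assembler's parser.

```python
# Each macro: (parameter names, {field: constant symbol or "&parameter"}).
MACROS = {
    "call": (("dest",), {"data": "&dest", "Am29331": "CALL"}),
}

def expand(name, *args):
    """Return the field assignments produced by one macro call."""
    params, body = MACROS[name]
    if len(args) != len(params):
        raise TypeError(f"{name} expects {len(params)} argument(s)")
    bindings = {"&" + param: arg for param, arg in zip(params, args)}
    # Replace macro variables by their arguments; leave symbols untouched.
    return {field: bindings.get(value, value) for field, value in body.items()}

print(expand("call", 0x40))
```

Fields not mentioned by the macro would then take the defaults from the instruction definition, as the text above states.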



AMDASM definitions for Am29114 Real Time Interrupt Controller 



WORD 4

MCLR:      EQU   H#0     ; Master clear
CHSR:      EQU   H#1     ; Clear highest in service reg
CCIR:      EQU   H#2     ; Clear highest in interrupt reg
NOOP:      EQU   H#3     ; No operation
BSMK:      EQU   H#4     ; Set mask reg from D-Bus
BCMK:      EQU   H#5     ; Clear mask reg from D-Bus
LDMK:      EQU   H#6     ; Load mask reg from D-Bus
RDMK:      EQU   H#7     ; Read mask reg to D-Bus
BSSR:      EQU   H#8     ; Set in service reg from D-Bus
BCSR:      EQU   H#9     ; Clear in service reg fr D-Bus
LDSR:      EQU   H#A     ; Load in service reg from D-Bus
RDSR:      EQU   H#B     ; Read in service reg to D-Bus
BSIR:      EQU   H#C     ; Set interrupt reg from D-Bus
BCIR:      EQU   H#D     ; Clear interrupt reg from D-Bus
LDIR:      EQU   H#E     ; Load interrupt reg from D-Bus
RDIR:      EQU   H#F     ; Read interrupt reg to D-Bus

INT.CNTL:  DEF   4VH#3   ; Default to no operation
AMDASM definitions for Am29331 Microprogram Sequencer

WORD 14

; Am29331 bit fields:
;
;   0      - FC     - Force continue
;   1      - CIN    - Increment carry in
;   2-7    - I0-I5  - Instruction
;   8      - INTEN  - Interrupt enable
;   9      - OE     - D-Bus Output enable
;   10-13  - S0-S3  - Test select

; FC values:

FCONT:     EQU   B#1     ; Force continue

; CIN values:

CINCR:     EQU   B#0     ; Increment by one
CNINCR:    EQU   B#1     ; Don't increment

; Condition control (COND) (I4-I5)

TRUE:      EQU   B#00    ; Branch on true
FALSE:     EQU   B#01    ; Branch on false
ALWAYS:    EQU   B#10    ; Branch always

; Address source (ADDR) (I2-I3)

D.BUS:     EQU   B#00    ; Address source - D-Bus
A.BUS:     EQU   B#01    ; Address source - A-Bus
MULTW:     EQU   B#10    ; Address source - Multiway
STACK:     EQU   B#11    ; Address source - Stack

; Sequencer operation (SEQ) (I0-I1)

BRA:       EQU   H#00    ; Branch
CALL:      EQU   H#01    ; Call
EXIT:      EQU   H#10    ; Exit
DJMP:      EQU   H#11    ; Decrement counter and jump
; Sequencer special instructions (I0-I5)

CONT:      EQU   6H#30:  ; Continue
FOR.D:     EQU   6H#31   ; For D ...
DECR:      EQU   6H#32   ; Decrement counter
LOOP:      EQU   6H#33   ; Loop ...
POP.D:     EQU   6H#34   ; Pop stack to D
PUSH.D:    EQU   6H#35   ; Push D on stack
RESET.SP:  EQU   6H#36   ; Reset stack pointer
FOR.A:     EQU   6H#37   ; For A ...
POP.C:     EQU   6H#38   ; Pop stack to Counter
PUSH.C:    EQU   6H#39   ; Push Counter to stack
SWAP:      EQU   6H#3A   ; Exchange Ctr and TOS
STACK.C:   EQU   6H#3B   ; Push Ctr & Load Ctr D
LOAD.D:    EQU   6H#3C   ; Load Ctr from D
LOAD.A:    EQU   6H#3D   ; Load Ctr from A
BSET:      EQU   6H#3E   ; Load Comp Reg from D
CLEAR:     EQU   6H#3F:  ; Disable Comparator

; Test conditions (S0-S3)

T0:        EQU   H#0     ; Test T0
T1:        EQU   H#1     ; Test T1
T2:        EQU   H#2     ; Test T2
T3:        EQU   H#3     ; Test T3
T4:        EQU   H#4     ; Test T4
T5:        EQU   H#5     ; Test T5
T6:        EQU   H#6     ; Test T6
T7:        EQU   H#7     ; Test T7
T8:        EQU   H#8     ; Test T8 ==
CARRY:     EQU   H#8     ;         Carry
T9:        EQU   H#9     ; Test T9 ==
SIGN:      EQU   H#9     ;         Negative sign
T10:       EQU   H#10    ; Test T10 ==
OVER:      EQU   H#10    ;         Overflow
T11:       EQU   H#11    ; Test T11 ==
ZERO:      EQU   H#11    ;         Zero or Equal
ULTB:      EQU   H#12    ; C+Z  Uns LT, borrow
ULT:       EQU   H#13    ; ~C+Z Uns LT
LT:        EQU   H#14    ; N xor V - Signed LT
LE:        EQU   H#15    ; (N xor V) + Z - LE
; Definitions for conditional sequencer operations

; (interrupts disabled)

SEQ:    DEF  B#0,B#1,2VB#11,2VB#00,2VB#00,B#0,B#1,4VH#0
;            FC  CIN COND   ADDR   SEQ    INTEN DOE TEST

; (interrupts enabled)

SEQI:   DEF  B#0,B#1,2VB#11,2VB#00,2VB#00,B#1,B#1,4VH#0
;            FC  CIN COND   ADDR   SEQ    INTEN DOE TEST

; Definitions for special sequencer operations

; (interrupts disabled)

SSEQ:   DEF  B#0,B#1,6VH#30:,B#0,B#1,4VH#0
;            FC  CIN I0-I5   INTEN DOE TEST

; (interrupts enabled)

SSEQI:  DEF  B#0,B#1,6VH#30:,B#1,B#1,4VH#0
;            FC  CIN I0-I5   INTEN DOE TEST

END
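
To see how the SEQ format fills the 14-bit word, the field defaults can be packed by hand. This sketch assumes the fields are laid out left to right in the order the DEF statement lists them, with the first field in the most significant bit; that ordering, and the names used below, are assumptions for illustration rather than a statement of how AMDASM emits the word.

```python
# (field, width, default) in the order given by the SEQ format definition.
SEQ_FORMAT = [
    ("FC", 1, 0b0),
    ("CIN", 1, 0b1),
    ("COND", 2, 0b11),
    ("ADDR", 2, 0b00),
    ("SEQ", 2, 0b00),
    ("INTEN", 1, 0b0),
    ("DOE", 1, 0b1),
    ("TEST", 4, 0x0),
]

def assemble_seq(**overrides):
    """Build the 14-bit sequencer word, substituting any overridden fields."""
    word = 0
    for name, width, default in SEQ_FORMAT:
        value = overrides.get(name, default)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word

print(f"{assemble_seq():014b}")
print(f"{assemble_seq(SEQ=0b01, TEST=0x8):014b}")
```

The second call overrides the SEQ field with the two-bit pattern 01 and selects test 8, leaving every other field at its default.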
{***********************************************************************}
{ MCASM (Microtec Assembler)                                            }
{ Definitions for Am29323 32-bit Parallel Multiplier                    }
{***********************************************************************}

rnd:     length (1),                          { Round control }
         values (0 : inactive, 1 : active),
         default (inactive);

format:  length (1),                          { Format adjust }
         values (0 : fractional, 1 : signed),
         default (signed);

psel:    length (2),                          { Output control }
         values (0 : temp,                    { Temp reg }
                 1 : low,                     { Lower half }
                 2 : high,                    { Upper half }
                 3 : none),                   { No output }
         default (none);

acc:     length (2),                          { Accumulator control }
         values (0 : pass,
                 1 : accum,
                 3 : shift),
         default (pass);

xsel:    length (1),                          { Select X register }
         values (0 : XB, 1 : XA),
         default (XA);

tcx:     length (1),                          { X mode control }
         values (0 : unsigned, 1 : signed),
         default (signed);

ftx:     length (1),                          { Feedthru control for X regs }
         values (0 : registered, 1 : transparent),
         default (registered);

enx:     length (2),                          { Load XA and XB regs }
         values (0 : both,
                 1 : XA,
                 2 : XB,
                 3 : none),
         default (none);

ysel:    length (1),                          { Select Y register }
         values (0 : YB, 1 : YA),
         default (YA);

tcy:     length (1),                          { Y mode control }
         values (0 : unsigned, 1 : signed),
         default (signed);

fty:     length (1),                          { Feedthru control for Y regs }
         values (0 : registered, 1 : transparent),
         default (registered);

eny:     length (2),                          { Load YA and YB regs }
         values (0 : both,
                 1 : YA,
                 2 : YB,
                 3 : none),
         default (none);

tsel:    length (1),                          { Temporary reg load select }
         values (0 : low,                     { Lower half }
                 1 : high),                   { Upper half }
         default (low);

ent:     length (1),                          { Load temporary reg }
         values (0 : load, 1 : hold),
         default (hold);

eni:     length (1),                          { Load instruction reg }
         values (0 : load, 1 : hold),
         default (hold);

enp:     length (1),                          { Load accumulator }
         values (0 : load, 1 : hold),
         default (hold);

fti:     length (1),                          { Feedthru control for inst reg }
         values (0 : registered, 1 : transparent),
         default (registered);

ftp:     length (1),                          { Feedthru control for accum }
         values (0 : registered, 1 : transparent),
         default (registered);
{***********************************************************************}
{ MCASM (Microtec Assembler)                                            }
{ Macros for Am29323 32-bit Parallel Multiplier                         }
{***********************************************************************}

{***********************************************************************}
{ Load X Register                                                       }
{***********************************************************************}
macro loadX &X &mode;
begin
    output ("enx = &X, tcx = &mode");
end

{***********************************************************************}
{ Load Y Register                                                       }
{***********************************************************************}
macro loadY &Y &mode;
begin
    output ("eny = &Y, tcy = &mode");
end

{***********************************************************************}
{ Load Temp Register                                                    }
{***********************************************************************}
macro loadT &mode;
begin
    output ("ent = load, tsel = &mode");
end

{***********************************************************************}
{ Select X & Y registers                                                }
{***********************************************************************}
macro selXY &X &Y;
begin
    output ("xsel = &X, ysel = &Y");
end

{***********************************************************************}
{ Multiplier function                                                   }
{***********************************************************************}
macro mul &A &mode;
begin
    output ("acc = &A, enp = load, psel = &mode, eni = load");
end
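
Each macro above emits a string of field assignments with its parameters substituted in. The Python sketch below imitates that substitution step for illustration only; it assumes parameters are written as an ampersand followed by a name, and the `expand` helper is hypothetical rather than part of MCASM.

```python
import re


def expand(template, **args):
    """Replace every &name in the template with the matching argument."""
    def substitute(match):
        name = match.group(1)
        if name not in args:
            raise KeyError(f"missing macro argument: {name}")
        return str(args[name])

    return re.sub(r"&(\w+)", substitute, template)


# Body of the load-X macro, written with &-style parameters.
LOADX_BODY = "enx = &X, tcx = &mode"

if __name__ == "__main__":
    print(expand(LOADX_BODY, X="XA", mode="signed"))
```

A call such as `expand(LOADX_BODY, X="XA", mode="signed")` yields the assignment list that the assembler would then merge into the current microword.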
{***********************************************************************}
{ MCASM (Microtec Assembler)                                            }
{ Definitions for Am29332 32-bit Arithmetic Logic Unit                  }
{***********************************************************************}

position: length (6),                         { LSB Position or shift count }
          default (0);

width:    length (5),                         { Width of field }
          default (31);

case of
0 : begin
    b_width: length (2),                      { Byte width of data }
             values (0 : four,   0 : long,
                     1 : one,    1 : byte,
                     2 : two,    2 : short,
                     3 : three),
             default (four);

    Am29332: length (7),                      { Instruction }
             values (H'00' : ZERO-EXTA,       { Zero extend A }
                     H'01' : ZERO-EXTB,       { Zero extend B }
                     H'02' : SIGN-EXTA,       { Sign extend A }
                     H'03' : SIGN-EXTB,       { Sign extend B }
                     H'04' : PASS-STAT,       { Pass status to Y }
                     H'05' : PASS-Q,          { Pass Q reg to Y }
                     H'06' : LOADQ-A,         { Load A into Q }
                     H'07' : LOADQ-B,         { Load B into Q }
                     H'08' : NOT-A,           { Not A }
                     H'09' : NOT-B,           { Not B }
                     H'0A' : NEG-A,           { 2's complement A }
                     H'0B' : NEG-B,           { 2's complement B }
                     H'0C' : PRIOR-A,         { Output priority A }
                     H'0D' : PRIOR-B,         { Output priority B }
                     H'0E' : MERGEA-B,        { Merge A with B }
                     H'0F' : MERGEB-A,        { Merge B with A }
                     H'10' : DECR-A,          { A - 1 }
                     H'11' : DECR-B,          { B - 1 }
                     H'12' : INCR-A,          { A + 1 }
                     H'13' : INCR-B,          { B + 1 }
                     H'14' : DECR2-A,         { A - 2 }
                     H'15' : DECR2-B,         { B - 2 }
                     H'16' : INCR2-A,         { A + 2 }
                     H'17' : INCR2-B,         { B + 2 }
                     H'18' : DECR4-A,         { A - 4 }
                     H'19' : DECR4-B,         { B - 4 }
                     H'1A' : INCR4-A,         { A + 4 }
                     H'1B' : INCR4-B,         { B + 4 }
                     H'1C' : LDSTAT-A,        { Load A into status }
                     H'1D' : LDSTAT-B,        { Load B into status }
                     H'1E' : undefined1,      { RESERVED }
                     H'1F' : undefined2,      { RESERVED }
                     H'20' : DN1-0F-A,        { A >> 1, zero fill }
                     H'21' : DN1-0F-B,        { B >> 1, zero fill }
                     H'22' : DN1-0F-AQ,       { AQ >> 1, zero fill }
                     H'23' : DN1-0F-BQ,       { BQ >> 1, zero fill }
                     H'24' : DN1-1F-A,        { A >> 1, one fill }
                     H'25' : DN1-1F-B,        { B >> 1, one fill }
                     H'26' : DN1-1F-AQ,       { AQ >> 1, one fill }
                     H'27' : DN1-1F-BQ,       { BQ >> 1, one fill }
                     H'28' : DN1-LF-A,        { A >> 1, link fill }
                     H'29' : DN1-LF-B,        { B >> 1, link fill }
                     H'2A' : DN1-LF-AQ,       { AQ >> 1, link fill }
                     H'2B' : DN1-LF-BQ,       { BQ >> 1, link fill }
                     H'2C' : DN1-AR-A,        { A >> 1, sign fill }
                     H'2D' : DN1-AR-B,        { B >> 1, sign fill }
                     H'2E' : DN1-AR-AQ,       { AQ >> 1, sign fill }
                     H'2F' : DN1-AR-BQ,       { BQ >> 1, sign fill }
                     H'30' : UP1-0F-A,        { A << 1, zero fill }
                     H'31' : UP1-0F-B,        { B << 1, zero fill }
                     H'32' : UP1-0F-AQ,       { AQ << 1, zero fill }
                     H'33' : UP1-0F-BQ,       { BQ << 1, zero fill }
                     H'34' : UP1-1F-A,        { A << 1, one fill }
                     H'35' : UP1-1F-B,        { B << 1, one fill }
                     H'36' : UP1-1F-AQ,       { AQ << 1, one fill }
                     H'37' : UP1-1F-BQ,       { BQ << 1, one fill }
                     H'38' : UP1-LF-A,        { A << 1, link fill }
                     H'39' : UP1-LF-B,        { B << 1, link fill }
                     H'3A' : UP1-LF-AQ,       { AQ << 1, link fill }
                     H'3B' : UP1-LF-BQ,       { BQ << 1, link fill }
                     H'3C' : ZERO,            { Zeros to Y }
                     H'3D' : SIGN,            { -1 to Y if N == 1 }
                     H'3E' : OR,              { A or B }
                     H'3F' : XOR,             { A exclusive or B }
                     H'40' : AND,             { A and B }
                     H'41' : XNOR,            { A exclusive nor B }
                     H'42' : ADD,             { A + B }
                     H'43' : ADDC,            { A + B + carry }
                     H'44' : SUB,             { A - B }
                     H'45' : SUBR,            { B - A }
                     H'46' : SUBC,            { A - B - carry }
                     H'47' : SUBRC,           { B - A - carry }
                     H'48' : SUM-CORR-A,      { Correct BCD A for add }
                     H'49' : SUM-CORR-B,      { Correct BCD B for add }
                     H'4A' : DIFF-CORR-A,     { Correct BCD A for sub }
                     H'4B' : DIFF-CORR-B,     { Correct BCD B for sub }
                     H'4E' : SDIVFIRST,       { First step signed }
                     H'4F' : UDIVFIRST,       { First step unsigned }
                     H'50' : SDIVSTEP,        { Iter step signed }
                     H'51' : SDIVLAST1,       { Last step signed / + }
                     H'52' : MPDIVSTEP1,      { First step multi / }
                     H'53' : MPSDIVSTEP3,     { Last step multi signed }
                     H'54' : UDIVSTEP,        { Iter step unsigned / }
                     H'55' : UDIVLAST,        { Last step unsigned / }
                     H'56' : MPDIVSTEP2,      { Iter step multi / }
                     H'57' : MPUDIVSTEP3,     { Last step multi uns }
                     H'58' : REMCORR,         { Correct rem after / }
                     H'59' : QUOCORR,         { Correct quo after / }
                     H'5A' : SDIVLAST2,       { Last step signed / - }
                     H'5B' : UMULFIRST,       { First step unsigned * }
                     H'5C' : UMULSTEP,        { Iter step unsigned * }
                     H'5D' : UMULLAST,        { Last step unsigned * }
                     H'5E' : SMULSTEP,        { Iter step signed * }
                     H'5F' : SMULFIRST),      { First step signed * }
             default (ADD);
    end;
1 : begin
    pos_src: length (1),                      { Source for position }
             values (0 : pins, 1 : reg),
             default (pins);

    wid_src: length (1),                      { Source for width }
             values (0 : pins, 1 : reg),
             default (pins);

    Am29332: length (7),                      { Instruction }
             values (H'60' : NB-SN-SHA,       { A << pos, sign fill }
                     H'61' : NB-SN-SHB,       { B << pos, sign fill }
                     H'62' : NB-0F-SHA,       { A << pos, zero fill }
                     H'63' : NB-0F-SHB,       { B << pos, zero fill }
                     H'64' : NBROT-A,         { Rotate A up pos bits }
                     H'65' : NBROT-B,         { Rotate B up pos bits }
                     H'66' : EXTBIT-A,        { Extract A<pos> }
                     H'67' : EXTBIT-B,        { Extract B<pos> }
                     H'68' : SETBIT-A,        { A<pos> = 1 }
                     H'69' : SETBIT-B,        { B<pos> = 1 }
                     H'6A' : RSTBIT-A,        { A<pos> = 0 }
                     H'6B' : RSTBIT-B,        { B<pos> = 0 }
                     H'6C' : SETBIT-STAT,     { STAT<pos> = 1 }
                     H'6D' : RSTBIT-STAT,     { STAT<pos> = 0 }
                     H'6E' : NOTF-AL-B,       { Comp B field }
                     H'6F' : PASSF-AL-B,      { Pass B, set Z flag }
                     H'70' : NOTF-A,          { Comp A field, unalgnd }
                     H'71' : NOTF-AL-A,       { Comp A field, aligned }
                     H'72' : PASSF-A,         { Pass A field, unalgnd }
                     H'73' : PASSF-AL-A,      { Pass A field, aligned }
                     H'74' : ORF-A,           { A or B, unaligned }
                     H'75' : ORF-AL-A,        { A or B, aligned field }
                     H'76' : XORF-A,          { A xor B, unaligned }
                     H'77' : XORF-AL-A,       { A xor B, aligned field }
                     H'79' : ANDF-AL-A,       { A and B, aligned field }
                     H'78' : ANDF-A,          { A and B, unaligned }
                     H'7A' : EXTF-A,          { Extract field in A }
                     H'7B' : EXTF-B,          { Extract field in B }
                     H'7C' : EXTF-AB,         { Extract field in AB }
                     H'7D' : EXTF-BA,         { Extract field in BA }
                     H'7E' : EXTBIT-STAT,     { Extract STAT<pos> }
                     H'7F' : PASS-MASK);      { Generate mask pattern }
    end;

endcase;

borrow:   length (1),                         { Borrow mode }
          default (0);

hold:     length (1),                         { Hold status & Q }
          default (0);
{***********************************************************************}
{ Macros for MCASM (Microtec Assembler)                                 }
{ Macros for Am29332 32-bit ALU                                         }
{***********************************************************************}

{***********************************************************************}
{ datasize - set data size for subsequent operations                    }
{***********************************************************************}
macro datasize &sz;
global &dsize;
begin
    &dsize = &sz;
end

{***********************************************************************}
{ ALU - set alu operation with fixed data size                          }
{***********************************************************************}
macro ALU &op;
global &dsize;
begin
    output ("b_width = &dsize, Am29332 = &op");
end

{***********************************************************************}
{ preg - set position source to register                                }
{***********************************************************************}
macro preg;
begin
    output ("pos_src = reg");
end

{***********************************************************************}
{ wreg - set width source to register                                   }
{***********************************************************************}
macro wreg;
begin
    output ("wid_src = reg");
end

{***********************************************************************}
{ ALUv - set alu operation for variable data size                       }
{***********************************************************************}
macro ALUv &op &pos &width;
begin
    output ("position = &pos, width = &width, Am29332 = &op");
end
/************************************************************************/
/*                                                                      */
/*  MetaStep (Step Assembler)                                           */
/*  Definitions for Am29325 32-bit Floating Point Processor             */
/*                                                                      */
/************************************************************************/

enr:       length (1),                    /* Load Register R   */
           values (0 : LOAD, 1 : NOP),
           default (NOP);

ens:       length (1),                    /* Load Register S   */
           values (0 : LOAD, 1 : NOP),
           default (NOP);

enf:       length (1),                    /* Load Register F   */
           values (0 : LOAD, 1 : NOP),
           default (NOP);

R_Select:  length (1),                    /* R Source Select   */
           values (0 : BUS, 1 : F-Reg),
           default (BUS);

S_Select:  length (1),                    /* S Source Select   */
           values (0 : S-Reg, 1 : F-Reg),
           default (S-Reg);

Am29325:   length (3),                    /* FPU Instruction   */
           values (0 : PLUS,              /* F = R + S         */
                   1 : MINUS,             /* F = R - S         */
                   2 : MUL,               /* F = R * S         */
                   3 : 2MINUS,            /* F = 2 - S         */
                   4 : FLOAT,             /* F = float R       */
                   5 : INT,               /* F = int R         */
                   6 : DEC,               /* F = dec R         */
                   7 : IEEE),             /* F = ieee R        */
           default (0);

round:     length (2),                    /* Rounding Mode     */
           values (0 : NEAREST,
                   1 : DOWN,
                   2 : UP,
                   3 : ZERO),
           default (NEAREST);
/************************************************************************/
/*                                                                      */
/*  Macros for MetaStep (Step Assembler)                                */
/*  Macros for Am29325 32-bit Floating Point Processor                  */
/*                                                                      */
/************************************************************************/

/************************************************************************/
/*  Load R Register                                                     */
/************************************************************************/
macro loadr &src;
begin
    R_select = &src, enr = LOAD
endm;

/************************************************************************/
/*  Load S Register                                                     */
/************************************************************************/
macro loads;
begin
    ens = LOAD
endm;

/************************************************************************/
/*  Load F Register                                                     */
/************************************************************************/
macro loadf;
begin
    enf = LOAD
endm;

/************************************************************************/
/*  Do all 1 operand FPU operations                                     */
/************************************************************************/
macro fpu &op &s;
begin
    Am29325 = &op, S_select = &s
endm;

/************************************************************************/
/*  Do all operand FPU operations                                       */
/************************************************************************/
macro fcvrt &op;
begin
    Am29325 = &op
endm;
/************************************************************************/
/*                                                                      */
/*  MetaStep (Step Assembler)                                           */
/*  Definitions for Am29334 Four-Port Register File                     */
/*                                                                      */
/************************************************************************/

Wrt_enable_A: length (4),                 /* Write enable for port A */
              values (H'0' : double,
                      H'8' : 3byte,
                      H'3' : high-word,
                      H'C' : low-word,
                      H'7' : byte3,
                      H'B' : byte2,
                      H'D' : byte1,
                      H'E' : byte0,
                      H'F' : none),
              default (none);

OEA:          length (1),                 /* Port A output enable    */
              values (0 : enable,
                      1 : disable),
              default (disable);

A-write:      length (6);                 /* A write address         */

A-read:       length (6);                 /* A read address          */

Wrt_enable_B: length (4),                 /* Write enable for port B */
              values (H'0' : double,
                      H'8' : 3byte,
                      H'3' : high-word,
                      H'C' : low-word,
                      H'7' : byte3,
                      H'B' : byte2,
                      H'D' : byte1,
                      H'E' : byte0,
                      H'F' : none),
              default (none);

OEB:          length (1),                 /* Port B output enable    */
              values (0 : enable,
                      1 : disable),
              default (disable);

B-write:      length (6);                 /* B write address         */
B-read:       length (6);                 /* B read address          */
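
The nine write-enable codes listed for each port are consistent with a 4-bit active-low mask, one bit per byte, with bit 0 controlling byte 0. Under that reading (an inference from the table above, not a statement taken from the register file's data sheet), the Python sketch below decodes a code into the byte lanes it writes. The function name is illustrative.

```python
def bytes_written(code):
    """Return the byte lanes (0-3) enabled by a 4-bit write-enable code.

    Assumes a bit value of 0 enables the byte with the same index,
    which matches every symbolic value in the definition table.
    """
    if not 0 <= code <= 0xF:
        raise ValueError("write-enable code must fit in 4 bits")
    return [lane for lane in range(4) if not (code >> lane) & 1]


if __name__ == "__main__":
    for name, code in [("double", 0x0), ("3byte", 0x8), ("high-word", 0x3),
                       ("low-word", 0xC), ("byte0", 0xE), ("none", 0xF)]:
        print(f"{name:10s} H'{code:X}' -> lanes {bytes_written(code)}")
```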
/************************************************************************/
/*                                                                      */
/*  MACROS for MetaStep (Step Assembler)                                */
/*  Macros for Am29334 Four-Port Register                               */
/*                                                                      */
/************************************************************************/

/************************************************************************/
/*  SrcA - select A register source                                     */
/************************************************************************/
macro SrcA &n;
begin
    A-read = &n, OEA = enable
endm;

/************************************************************************/
/*  SrcB - select B register source                                     */
/************************************************************************/
macro SrcB &n;
begin
    B-read = &n, OEB = enable
endm;

/************************************************************************/
/*  DestA - select A register destination and size                      */
/************************************************************************/
macro DestA &n &size;
begin
    A-write = &n, Wrt_enable_A = &size
endm;

/************************************************************************/
/*  DestB - select B register destination and size                      */
/************************************************************************/
macro DestB &n &size;
begin
    B-write = &n, Wrt_enable_B = &size
endm;
5.4 MICROCODE DEVELOPMENT

5.4.1 Step Engineering 
32-Bit Development Tools 

Step Engineering offers an integrated set of powerful 
development tools for the design and development of 
microprogram-based systems. In particular, these devel- 
opment tools are well suited for use with 32-bit building 
block devices such as the Am29300 family of compo- 
nents from AMD. 

For the 32-bit system designer, the MetaStep Language
System provides a powerful and flexible language defini-
tion, design, and development system for the develop-
ment of customized microinstructions and micropro-
grams. An important feature of the language is the ability
to support both high order language constructs and bit-
vector level operations. In addition, comprehensive
source level debug facilities are inherent in the language,
with a link to the STEP-40 SDT hardware debug stations.

The STEP-40 SDT is Step's system-level development 
tool for Am29300 32-bit microprogram-based design. It 
offers a comprehensive array of hardware tools and user 
interface software that supports every level of the devel- 
opment task. 

The MetaStep Language System 

The MetaStep Language System from Step Engineering
is a powerful new microprogramming tool for the pro-
grammer/designer who wishes to utilize microprogram-
based devices such as the Am29300 family as well as the
Am2901, the Am2910, the Am29116, and many other bit-
slice or microprogrammable units. MetaStep is a full-
featured and well-structured microprogram meta-as-
sembler with advanced features that give the program-
mer great power and flexibility. Both an elegant high
order and a powerful bit-level language system,
MetaStep includes five interrelated language modules
and an AMDASM-to-MetaStep translator program.

A unique feature of the MetaStep Language is the 
MetaStep QuickLearn Environment. This integral envi- 
ronment expedites the development and debug of micro- 
programs by providing a menu driven, interactive pro- 
gram that gives the user instant access to a user- 
selected editor, a file display program, a directory listing, 
an automated definition file generator and the MetaStep 
assembler. This program lets the user easily generate a 
definition file, assemble a program, quickly move from an 
assembly error directly to the line in his source code 
that contains the error, correct that error and return to 
assembly. With single keystrokes the user can select
from a variety of options and move quickly from one 
programming environment to another. 

These features can greatly increase the speed and 
accuracy of definition file and microprogram generation 
by eliminating much of the tedious, time-consuming and 
error-prone task of catching and correcting syntactical 
errors. 

Unlike earlier, more primitive microprogram assemblers,
the MetaStep language system provides both high level
and low level programming constructs for the designer/
programmer. For the hardware designer/debugger,
MetaStep supports any "close to the hardware" program-
ming style with total control of bit level field constructs.
This is termed bit vector level coding. MetaStep is also
the ONLY microprogram meta-assembler to support true
source level debug when linked to a STEP-40 SDT
system.

MetaStep supports a full range of macro instruction 
features that let the programmer easily and quickly take 
full advantage of the power inherent in devices such as 
the Am29332 ALU, the Am29331 Sequencer, the 
Am29334 Register File, the Am29C323 Multiplier and 
Am29325 Floating Point Processor. 

This flexible language provides the ability to create 
complex high level language constructs specifically tai- 
lored to your application. These constructs can be of any 
complexity, up to and including those of a custom lan- 
guage compiler. Of particular interest is the ability to 
intersperse bit-level instructions freely among high order 
constructs. This allows performance-critical code to be 
hand-crafted and placed within high order assembly or 
even high level language statements. 

Design rule constraint management, error checking, 
data field validation, user-defined warning messages, 
and automatic pipeline compensation mechanisms pro- 
vide a rich, defensive programming environment that 
permits error detection at assembly time, rather than at 
debug or runtime. 

MetaStep features include a free-form and position- 
independent syntax, informative listings of macro expan- 
sions, field assignments, default assignments, symbol 
cross references, and symbol table listings, automatic 
hardware-to-software bit position mapping, field check- 
ing facilities, pipeline delay facilities, constraint manage- 
ment, consumption of AMDASM code, 28 expression 
operators, close interface to runtime debug facilities, and 
generation of files that give runtime information in sym- 
bolic form. MetaStep also supports meta-disassembly. 



Reprinted with permission from Step Engineering, Inc.
MetaStep is presently distributed for use on five different
types of systems: CPM/68K-based systems, MS/DOS-
based systems, VAX/UNIX-based systems, VAX/VMS-
based systems, and SUN UNIX-based workstations.
Support for other operating systems will be added in the
future.

The five MetaStep language modules are called the
Definition Processor, the Assembler Processor, the
Linker Processor, the Format Processor and the UDS or
User-Defined Symbolics Processor.

The Definition processor is used to define a language for
a given target architecture, field by field, with logical
groupings where appropriate. The definition processor
defines constraints over fields, groups of fields, and
entire instructions. Included in the definition processor is
the ability to define macroinstructions, constants, and
variables only once, and to then make those values
available to the entire language system.

The Assembler processor is a macro-driven, relocating
and constraint-maintaining microprogram assembler. It
produces relocatable object modules, error, warning,
and user-defined messages, and symbolic output for use
by the linker and system debuggers.

The Linker processor generates absolute code as well as 
debug, symbol and structure tables from definition proc- 
essor and assembler processor output files. 

The Formatter processor takes the absolute object file
output of the linker and extracts several different types of
information. These include a binary output file loadable
into a STEP-40 SDT development tool, a hexadecimal
output file, a symbol file with user program global labels
and addresses, and a debug file for on-line assembly/
disassembly and source level debug.

The User-Defined-Symbolics processor automatically
generates User-Defined-Symbolics or UDS files. This
frees the debug engineer who wishes to perform debug
functions at the source level from the task of redefining
the symbolics of the language every time he does a re-
assembly.

The AMDASM-to-MetaStep translator offers the ability to
take current AMDASM assembly source code and auto-
matically translate that source into a syntactic form that
is accepted by the MetaStep assembler.

MetaStep can be configured to execute in two environ-
ments: the station model, intended for use on a STEP-40
SDT development station; and the no-station model,
intended for use in environments that do not use the
STEP development stations or MetaStep language sys-
tem debug and symbol files.



Some of the more important features of MetaStep are: 

Free-form, non-positional keyword syntax

Powerful macro facility 

Symbolic field names 

Data types such as strings, integer, and 
enumeration 

If and for assembler directives 

Case statements 

Recursive expression facility

Attribute operators 

Modular programming support 

Design rule management 

Automatic pipeline delay compensation 

Relocatable object code 

Any order bit-to-field assignments 

Link to true source level debug 

Easy integration to hardware debug station

Consumes AMDASM source code 

Fast (10,000 fields/minute) one-pass operation 

MetaStep solves the problems associated with older
positional microprogram assemblers, i.e., the difficulties
in keeping track of fields and field values by rote and
precise positioning, the lack of any value or error check-
ing mechanisms, the lack of a link to a hardware debug
system at the symbolic level, and the lack of any means
of reconstructing backwards from the microword to the bit
fields that comprise it.

MetaStep provides the non-positional capability to define
fields in logical order rather than simply by microcode
instruction address, and includes support for nested
macros, case structures and keyword parameters. The
following is an illustration of a partial MetaStep program.
As can be seen below, MetaStep has the ability to
support both bit vector and high level coding techniques.
The upper program segment illustrates a field by field
programming style that uniquely declares each pertinent
field in the microinstruction word. The lower segment
shows a second MetaStep example that uses only high
level statements to perform the same operation. As can
be imagined, utilizing high level language constructs
greatly eases the programming task. For convenience
and power, the programmer can intermix low level and
high level program statements and/or start his program-
ming task with simplistic statements and then grow into
more complex usages as his experience grows.

Two illustrative MetaStep program statements:

Should the programmer/designer wish to program at the 
bit vector level, a simple MetaStep bit vector level pro- 
gram could be written like: 



OP116 = TORAA,
SRCDST = OR, REG = R1,
CTLYEN = YEN_L, CCMUX = T1,
2910INST = CONT, TCONTROL = N1,
JMPADR = WALK, DLE = DLE_H, OET = OET_H,
SRE = SRE_L, IEN = IEN_L,
OEY = OEY_L



A comparable MetaStep partial program using High 
Order Language or HOL constructs would look like this: 



ACC <- ACC OR R1



While the previous example illustrates the simplicity of 
using MetaStep, the microprogrammer may very well 
be more concerned with power and flexibility. Devices 
like the Am29332 are complex devices with powerful in- 
struction sets. To best take advantage of their power, 
MetaStep can incorporate all of the possible configura- 
tions of an Am29332 instruction into one clear MetaStep 
instruction. 

For example, there are numerous options available to the
programmer on each Am29332 instruction. Fixed length
and variable length instructions such as MOVEs,
SHIFTs, ADDs, SUBTRACTs, MULTIPLY/DIVIDEs, of-
fer several different source and destination locations
depending upon the class of instruction. With MetaStep,
a programmer need define each Am29332 instruction
only once, using high level constructs such as the CASE
directive to define all of the possible configurations of the
instruction. Then throughout his program, he can utilize
that definition with a simple high order instruction mne-
monic that takes into account all of the various complica-
tions associated with that instruction and data and source
combinations.

In addition, he can prevent microprogramming errors by 
providing error checking conditions within the instruction 
definition, so that illegal conditions are flagged at the 
assembly level, not at the debug level. 

In this way, the programmer can reduce a large and
complex instruction set to a few easy to remember
mnemonics. This frees the programmer to concentrate
on the logic of his program, so that the microprogram-
mer can quickly apply all of the power of the Am29300
family to his design.

MetaStep system components share a common data-
base and utilize common control constructs. The defini-
tion processor provides the capability to define variables,
a string facility that allows concatenation, and it supports
cohesion operations as well as 28 expression operators.
The definition processor's ability to nest macros, pass
variables through macro expansions, and perform recur-
sion makes it a powerful facility for creating custom
languages.

Constraint management facilities include a check de-
scriptor that may be utilized to test constraints on a single
field, a case branch, an entire microinstruction, or be-
tween microinstructions. Most importantly, rules of the
target architecture may be embedded in the language
facilities to detect bugs at assembly time rather than
debug time. This facility allows user-defined procedural-
based design rules to be enforced.

With MetaStep, memory space controls allow code to be
generated for not only multiple segments, but multiple
memory segments. This allows a single program to
generate code for modern architecture class machines
such as Harvard class machines and data flow architec-
tures that typically contain multiple program stores.

A significant advantage offered by MetaStep is that the
database files generated from the definition, assembler
and linker are common and provide a method to pass all
language constructs to debug tools such as the STEP-40
SDT. This means that the STEP-40 development tools
can now have the capability to use the language defini-
tion files and all symbol tables to create true meta-
disassembly. Powerful source level debug can greatly
speed the development of any microprogram design
and, in particular, as microprogram-based systems
increase in complexity, true source level debug is a
necessity.
MetaStep Quick Reference

MetaStep System Overview 

• Common system elements shared between 
MetaStep processors 

• Five Processors 

- Definition processor 

- Assembler processor 

- Linker processor 

- Format processor 

- User defined symbolics processor 

• AMDASM-to-MetaStep translator program

• COMMON ELEMENTS: All processors share data 
files and common structures. 

- Common syntax and semantics: include forms of 
names, constants, directives and legal and 
illegal value definitions. 

- Common directives include: 

• Source Control Directives, 

- Listings - forms control, summary information, 

- Include - source inclusion, 

- Format - listing headings, trailers, and 
control, 

• Flow-of-Control Directives

- If - fully nested conditional control 

- For - repetitive conditional control statement 

• Macro facilities, including nested macro capa- 
bility and parameter passing and expansion. 

- Specification of assembly time constructs, 

- Shorthand specification of logical groupings 
of assignments. 

- Generation of warning and error messages. 

• DEFINITION PROCESSOR: accepts a definition 
of the target system architecture and develop- 
ment environment. 

- Micro-architecture description: by means of in-
struction/field formats.

- Instruction directive: names the architecture
and specifies instruction length. Maximum in-
struction length is 1024 bits.

- Field Description: defines a field as a group of
bits (not necessarily contiguous) that perform a
common function. Each field must be given a
field description.

A full set of field descriptors is as follows: 

• bits - define absolute bit locations of field in 
microinstruction 

• check - constraint check on assignment to this 
field 



• complement - two's complement field value 

• default - provide value when field is not as- 
signed 

• display - provide debugger and default radix 
information 

• invert - one's complement field value 

• length - specify length of field 

• mask - truncate values to field length 

• parity - this field is the parity field 

• reverse - reverse bits in field 

• valid - specify legal values for field 

• values - specify symbolic values for field 

VALUES, VALID, and CHECK provide syntactic,
semantic, and pragmatic verifications on a per-field
basis.

VALUES provide syntactic information indicating 
what are acceptable values for assignment to a 
field. 

VALID provides semantic information, listing all 
the acceptable values for the field. 

CHECK provides a way of examining assigned
values in the context of other field values or other state
information.
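
The value-modifying descriptors listed above (complement, invert, mask, reverse) and the VALID check can be illustrated with a minimal sketch. This is not MetaStep code; the function names and the Python representation of a field as an integer plus a bit length are assumptions made only for illustration.

```python
def mask(value: int, length: int) -> int:
    """Truncate a value to the low `length` bits of the field."""
    return value & ((1 << length) - 1)


def invert(value: int, length: int) -> int:
    """One's complement of the value, confined to the field width."""
    return ~value & ((1 << length) - 1)


def complement(value: int, length: int) -> int:
    """Two's complement of the value, confined to the field width."""
    return -value & ((1 << length) - 1)


def reverse(value: int, length: int) -> int:
    """Reverse the bit order within the field."""
    result = 0
    for _ in range(length):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def is_valid(value: int, valid_values) -> bool:
    """VALID-style check: the value must appear in the legal list."""
    return value in valid_values
```

In this sketch each modifier takes the field length explicitly, so the result never spills outside the field it is destined for.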

The Case Definition: alternative field interpretations.
A case definition can be specified for each
field. It is a powerful mechanism for defining
alternative bit values for overlapping fields.

The Environment Description: allows the programmer
to specify the development environment, with
constraints on field values, sequences of
microinstructions, and the relationship between field
values.

Features include: 

• bitMap 

• macros 

• EQU symbols

• variables 

- Constraints are provided in three general ways: 

• Symbolic values 

• Case branch constraints 

• Check descriptors - The check descriptor asso- 
ciates a constraint macro with one of the follow- 
ing: 

- a single field 

- a case branch 

- the entire microinstruction 

- Validations: numerous checks performed at
definition time verify that field names and values in
case branches are consistent.

• THE METASTEP ASSEMBLER: supports coding
styles ranging from bit vector specification through
high order language expression and each stage in
between. Allows mixing of bit vector and HOL
expressions during coding.

- Instructions: a series of comma-separated 
phrases. A phrase may be a field assignment, a 
macro-invocation, or a flow-of-control directive. 

- Field Assignments: consist of a field name, followed
by an equal sign, followed by an expression.

- Macro Phrases: a macro-invocation is a macro 
name, optionally followed by parameters. Macros 
may be nested. 

- Relocation Facilities 

• org 

• align 

• reserve 

• segment 

• entry 

• point 

• external

• METASTEP LINKER: combines all system elements
into absolute code that can be loaded into ROMs or
simulators. It also produces debug tables.

- Directives: 

• load 

• name 

• locate 

• reserve 

• fill 

• mapPoint 

• analyze 

• set 

• parity 

• AMDASM TO METASTEP TRANSLATOR: pro- 
duces MetaStep source statements from AMDASM 
source statements. 

The STEP-40 SDT

The STEP-40 SDT is the premier hardware-based devel- 
opment tool for any microprogram development task. In 
particular, it offers a comprehensive system for the 
design and debug of Am29300-based systems. It offers 
in one integrated chassis all of the development and 
debug tools needed for such an effort. With high reliability 
cabling and interconnect technology, the hardware 



chassis permits the plug-in addition of a wide range of 
distinct but interrelated hardware tools. An IBM-PC/AT 
computer system provides the human interface, mass 
storage, and I/O devices. 

Key Features of the STEP-40 SDT: 

• Fully supports 32-bit Am29300-based system devel- 
opment and debug. 

• Supports other microprogrammed products such as
bit-slice, ASIC, DSP, or VLSI. 

• Completely integrated hardware/software develop- 
ment station. 

• Powerful IBM-PC/AT-based microprogram support 
instrument. 

• Supports MetaStep, the first true high level language 
for microprogram development with in-line bit vector 
level support. 

• SOURCE LEVEL DEBUG available at all levels of 
hardware and software debug. 

• Reconfigurable, ultra-reliable 1 to 70 ns writable
control store supports up to 64K x 512-bit arrays.

• Real-time emulators for popular bit-slice AMD ALUs 
and sequencers. 

• Logic state analysis with trace memory and sophisti- 
cated multi-level control. 

• Performance analysis tools like histograms, timing 
analysis, access tracking and predicate analysis. 

• Regression Test tools for design validation. 

• Meta-Disassembly coupled with source edit, source 
management, version control, and on-line patch 
management. 

• User-Defined Symbolics allows conditional disas- 
sembly of trace or any system data. 

• Sophisticated, easy-to-use screen-oriented editor 
with pop-up help menus. 

HARDWARE resources include writable control store 
modules with the widest range of speeds and widths; 
real-time emulators for popular bit-slice parts such as the
Am2910, and Am29116; logic state analysis trace
memory modules with flexible clock and breakpoint 
control modules; a histogram/timing analysis module for 
performance analysis tasks; and high speed memory 
simulation modules for more than 450 popular ROMs, 
RAMs, and PROMs. With a powerful high speed bus and 
modular hardware design, the STEP-40 SDT presents 
no hardware limitations for designers utilizing the most 
advanced microprogrammed devices. 

SOFTWARE tools include a sophisticated, easy-to-use,
screen-oriented editor; a powerful turbo programmer's
environment for fast, error-free program development
and debug; MetaStep for superior high level and
bit-vector level programming; User-Defined Symbolics for
comprehensive on-line symbolic debug; Meta-Disas- 
sembly for true interactive symbolic debug with full 
access to MetaStep symbol tables; and performance 
analysis tools like histogram and time stamping, 
regression testing and automated test suite generation 
tools. The STEP-40 SDT is the first system to offer 
source level debug throughout the development and 
debug environment. 

Because the STEP-40 SDT is an IBM-PC/AT based 
development station, it gives you the best of both worlds: 
a wide range of comprehensive hardware debug re- 
sources coupled with a fast, convenient and well-sup- 
ported computer system. The IBM AT, in particular, offers 
the widest range of software support of any lab-based 
system in the industry. The IBM-PC/AT workstations 
have the power to match the STEP-40 SDT debug 
station. As intelligent hosts they can support advanced 
user interfaces and control the multiple hardware re- 
sources. In addition, system updates and new features 
can be added quickly thanks to the flexibility inherent in 
these standard workstations. As hardware needs 
change, the user need only add hardware modules to the 
STEP-40 SDT specialized hardware chassis. 

Hardware Tools 

Plug-in writable control store modules are available with
flexible array configurations from 1K x 64 to 16K x 128 per
module. Modules can be mapped into arrays of up to 64K
x 512 bits in size. Access times vary from 70 ns to 10 ns
(and even faster when RAM technology permits).

The Writable Control Store (WCS) is a dual-port memory
accessible from either the STEP-40 SDT or the target
system. Both ECL and TTL RAM are supported with the
industry's most comprehensive array of memory
emulation. Having up to 16K x 128 bits on a single WCS,
rather than many small boards connected with many cables,
dramatically improves reliability and signal integrity. The
user can configure to meet his design objective without
sacrificing reliability or performance. Further, the
STEP-40 SDT can support up to 32 independent arrays
controlled by either a single clock or multiple clocks.

Available Modules: 

• WCS-64 is the fastest STEP WCS. It uses 1 ns ECL
RAMs and connects to the target via address and
data pods containing ECL to TTL translators.
Organized as 1K x 64 or 2K x 32 bits.

• WCS-128 provides twice the density of the WCS-64
with 1 ns ECL RAMs. Organized as 2K x 64, 4K x 32,
or 8K x 16 bits.

• WCS-256 and WCS-1024 provide even larger
memories for applications with less demanding speed
requirements. WCS-256 is configured as 4K x 64, 8K
x 32, or 16K x 16. WCS-1024 is configured as 16K x
64, 32K x 32, or 64K x 16 bits. Interface circuitry
matches exact user memory specifications.
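
As a quick arithmetic check, each module's alternative organizations hold the same total number of bits, and that total (in K bits) appears to match the number in the module name. The sketch below uses the organizations given under Hardware Specifications; the dictionary layout and function name are illustrative assumptions.

```python
# Organizations per module as (depth in K words, width in bits),
# taken from the Hardware Specifications list in this section.
WCS_ORGANIZATIONS = {
    "WCS-64": [(1, 64), (2, 32)],
    "WCS-128": [(2, 64), (4, 32), (8, 16)],
    "WCS-256": [(4, 64), (8, 32), (16, 16)],
    "WCS-1024": [(16, 64), (32, 32), (64, 16)],
}


def capacities_kbits(organizations):
    """Return the set of distinct capacities (in K bits) for one module."""
    return {depth_k * width for depth_k, width in organizations}


for name, organizations in WCS_ORGANIZATIONS.items():
    # Every organization of a module should give the same capacity.
    print(name, sorted(capacities_kbits(organizations)))
```

Trading depth for width in this way lets one module serve either a narrow, deep control store or a wide, shallow one without changing its capacity.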

LOGIC STATE ANALYSIS (LSA) - provides trace mem- 
ory modules with sophisticated clock, breakpoint and 
trace control. With true conditional bit-mapped disas- 
sembler (User Defined Symbolics or UDS), the LSA 
provides real-time 3-way branching using a 54-bit match- 
word to trigger the 25 MHz or 50 MHz trace memory. 
Linkage is provided to the symbol table of the user's 
source code for access to symbolic debug information. 
Source code can be interleaved with trace samples for 
easy cause (microinstruction) and effect (traced sample) 
readability and comparison. 

TRACE MEMORY is provided with either 4K (TM-256) 
or 16K (TM-1024) bits of real-time trace memory at 
speeds of 16 MHz, 25 MHz or 50 MHz. These memories 
act as a circular buffer storing the last 4K or 16K store 
samples. Store clock filtering extends the effective buffer 
depth substantially by filtering out unwanted samples. 
Triggering and sampling is controlled by the trace 
control module. 
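
The circular-buffer behavior and the effect of store clock filtering can be pictured with a small software model. This is only a sketch: the class name, the filter callback and the sample values are assumptions, and the real hardware qualifies store clocks in logic rather than in software.

```python
from collections import deque


class TraceMemory:
    """Software model of a fixed-depth circular trace buffer."""

    def __init__(self, depth, store_filter=None):
        # A deque with maxlen silently discards the oldest sample
        # once the buffer is full, like a circular buffer.
        self.samples = deque(maxlen=depth)
        # With no filter supplied, every sample is stored.
        self.store_filter = store_filter or (lambda sample: True)

    def clock(self, sample):
        """Offer one sample; store it only if the filter accepts it."""
        if self.store_filter(sample):
            self.samples.append(sample)


# Unfiltered: only the last four samples survive.
plain = TraceMemory(depth=4)
for sample in range(10):
    plain.clock(sample)

# Filtered: unwanted (odd) samples never consume buffer space,
# so the same depth reaches further back in time.
filtered = TraceMemory(depth=4, store_filter=lambda s: s % 2 == 0)
for sample in range(10):
    filtered.clock(sample)
```

In the filtered case the four stored samples span eight store clocks instead of four, which is the sense in which filtering extends the effective depth.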

TRACE CONTROL modules include the sophisticated
clock and breakpoint controls. With a screen editor
display, the user can set up to five 54-bit (16 address, 32
data, and 6 external qualifiers) matchwords per level to
qualify trace memory sampling. Up to 16 independent
levels for trace triggering or breakpoint are possible, with
each level allowing for three way branching on an IF,
ELSE-IF, ELSE-IF basis. A delay counter can be used on
each IF branch to count occurrences of the 54-bit
matchword or store cycles.
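
A 54-bit matchword built from 16 address bits, 32 data bits and 6 external qualifier bits can be sketched as follows. The bit ordering within the word and the use of a separate "care" mask for don't-care bits are assumptions for illustration; the source does not describe either.

```python
ADDRESS_BITS, DATA_BITS, QUALIFIER_BITS = 16, 32, 6
MATCHWORD_BITS = ADDRESS_BITS + DATA_BITS + QUALIFIER_BITS  # 54


def pack_matchword(address, data, qualifiers):
    """Pack the three parts into one integer (assumed ordering:
    qualifiers in the high bits, then data, then address)."""
    for value, bits in ((address, ADDRESS_BITS),
                        (data, DATA_BITS),
                        (qualifiers, QUALIFIER_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError("value does not fit in its field")
    return ((qualifiers << (ADDRESS_BITS + DATA_BITS))
            | (data << ADDRESS_BITS)
            | address)


def matches(sample, pattern, care_mask):
    """True when sample equals pattern on every bit set in care_mask."""
    return ((sample ^ pattern) & care_mask) == 0


# Hypothetical use: trigger on address 0x1234 regardless of data.
pattern = pack_matchword(0x1234, 0, 0)
address_only = (1 << ADDRESS_BITS) - 1
sample = pack_matchword(0x1234, 0xDEADBEEF, 0b000101)
hit = matches(sample, pattern, address_only)
```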

IN-CIRCUIT EMULATORS permit real-time emulation of
popular bit-slice circuits such as the Am2910, Am29116
and other popular devices. The user can directly observe
the internal states of these chips as they execute his
program. The user can examine and modify registers and
stacks. Execution control includes single step, multiple
step and run program commands. Multiple emulators
can be simultaneously controlled from a single emulator
control module. STEP in-circuit emulators will operate in
real-time at the full rated speed of the emulated circuit.

MEMORY EMULATOR modules support a wide range of
RAM, ROM and PROM devices. Over 450 popular 
memory devices can be emulated. 

PERFORMANCE ANALYSIS modules provide the hard- 
ware support for software features like histogram and 
time stamping. Time analysis can be performed with 12.5
ns resolution. Histograms can be in absolute time or in 
microcycles for precise execution measurements. A 48- 
bit timer/counter permits continuous analysis over hours 
and days, not just seconds. 
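
The claim about measurement span can be checked with simple arithmetic, assuming (the source does not say so explicitly) that the 48-bit counter advances once per 12.5 ns tick. A 32-bit counter is included only as a hypothetical point of comparison.

```python
TICK_NS = 12.5          # timing resolution quoted in the text
SECONDS_PER_DAY = 86_400


def rollover_seconds(counter_bits, tick_ns=TICK_NS):
    """Time until a free-running counter of the given width wraps."""
    return (2 ** counter_bits) * tick_ns * 1e-9


span_48_days = rollover_seconds(48) / SECONDS_PER_DAY
span_32_seconds = rollover_seconds(32)   # hypothetical narrower counter

print(f"48-bit counter: about {span_48_days:.1f} days")
print(f"32-bit counter: about {span_32_seconds:.1f} seconds")
```

Under this assumption the 48-bit counter runs for roughly forty days before wrapping, while a 32-bit counter at the same resolution would wrap in under a minute.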

Software Tools 

The STEP-40 SDT fully supports METASTEP, thus 
providing the world's first truly high level microcode 
development language in a fully integrated development 
station. 

METASTEP QUICKLEARN PROGRAMMING ENVIRONMENT
is a unique facility that speeds the development
of MetaStep programs. The user can quickly switch
from facility to facility without losing his place in his code.
This is particularly useful during program debug and
patch.

SOURCE LEVEL DEBUG is another unique capability of 
the STEP-40 SDT. With the MetaStep language as the 
foundation, a microcode-based project can be greatly 
speeded by utilizing symbolic information throughout the 
debug cycle. A truly interactive symbolic debug
capability, source level debug permits on-line meta-assembly,
on-line meta-disassembly, run-time editing at the source
level, and directly readable displays.

All STEP-40 SDT commands can reference symbolic 
labels defined in MetaStep. Thus, the user need enter 
and define his labels only once. Later he can use them 
throughout his debug tasks without reentering or redefin- 
ing them. This is a requirement for convenient debug of 
relocatable microcode. Other systems require that the 
user spend endless hours defining his symbolic informa- 
tion each time he reassembles his code. Source Level 
Debug also means that he can control his hardware
debug resources using this symbolic capability.

User Defined Symbolics (UDS) provides complete
display and control of microcode, trace data and emulator
data. Any arbitrary digital word can be conditionally 
disassembled into any symbolic representation. Unlike 
older systems that merely allow permutation of some 
fields in groups of contiguous bits, UDS gives the user a 
general purpose bit mapping (binary to symbolics) capa- 
bility unmatched by any other system. UDS has great 
utility in hardware trace situations. 

META-DISASSEMBLER capability allows the source 
definition to be accessed by the debug process and 
provides the user the abilities of disassembling his 
source code in-line, assembling in-line, plus insertion of 
additional microcode. 

PERFORMANCE ANALYSIS capabilities include histo- 
grams and time stamping. 

HISTOGRAMS permit absolute time or microcycle 
analysis of your microcode execution. With a 48-bit 
counter, time analysis can be performed over days and
weeks if necessary, not just seconds. This analysis can 
give you graphical information showing where code 
optimization can best help overall system performance. 

TIME STAMPING includes a 12.5 ns resolution to easily 
measure time between captured system events and 
provides both absolute and relative time stamping in both 
time and microcycles. 

QUALITY ASSURANCE TOOLS aid in reducing overall
system costs and in rapid test development. These 
include access tracking, predicate analysis and 
MetaStep facilities for maintenance of source and ver- 
sion control. 

REGRESSION TESTS such as AUTOSTEP provide 
the capability to generate, store and reuse system vali- 
dation tests from design definition throughout the life of 
the product. 

Hardware Specifications 

6-Slot Mainframe: 

6-user slots available per chassis 

Expandable backplane 

MetaMachines: 

Up to 32 per mainframe, each with separate data, ad- 
dress and/or clock inputs. 

Writable Control Store:

Total Address Space: 64K deep x 512 bits wide.

Modules:
WCS-64 - 1Kx64/2Kx32, 10ns or 15ns RAM speed.
WCS-128 - 2Kx64/4Kx32/8Kx16, 15ns or 25ns RAM speed.
WCS-256 - 4Kx64/8Kx32/16Kx16, 25ns or 35ns RAM speed.
WCS-512 - 4Kx128/8Kx64, 10ns, 15ns, and 25ns RAM speed.
WCS-1024 - 16Kx64/32Kx32/64Kx16, 35ns or 70ns RAM speed.
WCS-2048 - 16Kx128/32Kx64, 25ns, 30ns or 70ns RAM speed.



Simulation Pods: 

ECL to TTL / TTL to ECL conversion
TTL specifications 
Unlimited number of arrays 

Trace Memory: 

Sizes: 4K x 64 bits or 16K x 16 bits 
Number: up to 8 modules per trace controller. 



Clock, Trace and Breakpoint Controller: 

16-level, 54-bit match word, conditional trace and 
break supported. 

Logic State Analysis Control:

16 states, comprehensive control through counters,
timers, conditionals, triggers, and unlimited
breakpoints.

Additional information about MetaStep, the STEP- 
40 SDT and other Step tools for developing 
Am29300-based systems is available upon request 
from Step Engineering. Please contact: 

Step Engineering, Inc. 
661 East Arques Ave. 
P.O. Box 61166
Sunnyvale, CA 94088 

(408) 733-7837 
(800)538-1750 
TWX: 910-339-9506

mcASM Structured Microcode Assembler

The mcASM microcode assembler provides software 
support for the Am29300 family. A second generation 
Structured Microcode Assembler, mcASM was the result 
of a joint effort between Advanced Micro Devices and 
Microtec Research. Ten years of bit-slice and microcode 
assembler experience within both companies has been 
combined with the latest software technology to produce 
this advanced implementation of a relocatable microc- 
ode assembler. 

Special support is provided for the variable formats found
in the Am29300 family. This support is an additional
benefit as it provides constraint management for the
entire microcode word. New features make mcASM
faster and easier to use than previous microcode
assemblers. These features allow the programmer to
concentrate on the target system algorithm, thereby achieving a
more competitive target system.

mcASM Features 

• Am29300 family mnemonic definitions included 

• Hosted on VAX/VMS and PC/DOS

• PROM programmer, Microtec, AMD, and STEP 
output formats 

• Relocatable code segments 

• Overlay support

• Macros with keyword parameters 

• Automatic selection of word format 

• Keyword syntax 

• Local symbols for each field

• Fields defined with non-contiguous or contiguous 
bits 

Description 

As a meta-assembler, mcASM is used to assemble
source programs targeted for a user defined set of
hardware. First, a model definition program, mcDEF, is
used to define the target mnemonics and their
corresponding bit patterns for the assembler, mcASM. Then,
mcASM assembles the user's source program into
microinstructions for the target.

This meta-assembler is optimized for microcode
applications where very wide word widths (up to 1024 bits) are
not uncommon. A library of pre-defined part definitions is
included with mcASM for the Am29300 family and other
AMD microcode driven products to help the user quickly
build the hardware definition file.

Four related programs make up the product: mcDEF, 
mcASM, mcLINK, and mcPROM. 

A model of the target system is defined using the mcDEF 
definition language. The model is then compressed into 
a lookup table by the definition program, mcDEF. 

The model lookup table allows the microcode assembler, 
mcASM, to translate the user's assembly language 
source code into microcode bit patterns that drive the 
target system. Object modules generated by mcASM are 
in a relocatable format. Thus, smaller, more manageable 
source files can be generated. These can be independ- 
ently updated and quickly reassembled. 

Relocatable object modules are linked together with
mcLINK to form an absolute executable microcode
program. The program may include overlaid segments to
conserve target system memory. Four formats may be
selected as the mcLINK output format. These include
mcFMT, AMDASM, Microtec META29, and STEP
Engineering GENHEX.

A fourth program, mcPROM, converts the linker output 
into PROM files that can be downloaded into a PROM 
programmer. DATA I/O ASCII format and BNPF format 
are supported. 

Figure 5-4 shows an overview of the mcASM develop-
ment process and the following sections describe each 
component of the mcASM package. 

mcDEF - Definition Program 

The mcDEF definition program is a table builder that 
converts a model of the target hardware into a compact 
lookup table for later use by the assembler. The model is 
required by the assembler to describe how mnemonic 
names, used by the programmer, are converted into bit 
fields in a microcode word. 

mcDEF accepts an input file that describes the field 
structure of the microcode word. Each field is independ- 
ently described so it can be uniquely referenced by name 
in the assembly source code. The programmer can then 
directly reference any field and assign a value without 
having to put the value in a prescribed position in a source 
statement. 

Each field can also be assigned a default value so that all 
fields do not need to be encoded in each line of source 
code. Mnemonics assigned a value for a field are local to 
that field. The same mnemonic can be assigned a 
different value in another field. A partial example of a 
processor model is shown in Figure 5-5. 



Reprinted with permission from Microtec Research, Inc.






[Figure 5-4: flow diagram. Recoverable labels: Editor,
Definition Source, Definition Listing, mcDEF, Definition
Lookup Table, Assembler Sources, Assembler Listings,
Relocatable Objects, Linker Map, PROM Files, PROM
Programmer, STEP Engineering Development Sys., AMD
AmSYS29/10 Development Sys., Target System.]

* User Supplied Program

Figure 5-4. Overview of the Microtec Research mcASM Development Process

Sample Microword

| Mem | MAR | Pos | Width | Am29332 | Borrow | Hold | Data |

Microword Definition

Mem:      bit(40), length(1),  values(0:read, 1:write), default(read);
MAR:      bit(38), length(2),  values(0:nop, 1:load, 2:enable, 3:ld-en);
Position: bit(32), length(6),  default(0);
Width:    bit(27), length(5),  default(31);
Am29332:  bit(18), values(see file Am29332.def);
Borrow:   bit(17), length(1),  default(0);
Hold:     bit(16), length(1),  default(0);
Data:     bit(0),  length(16), default(0);

Figure 5-5. Sample Microword Organization
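
To make the role of bit positions, lengths, local mnemonics and defaults concrete, the following sketch encodes a microword from the Figure 5-5 definition. It is not mcDEF or mcASM code. It assumes that bit(n) names the lowest bit of a field, that the Am29332 field is 9 bits wide (inferred from the neighboring bit positions, since the figure gives no length), and that unassigned fields without a default are left at zero.

```python
# name: (lowest bit, length, local mnemonics, default)
FIELDS = {
    "Mem":      (40, 1, {"read": 0, "write": 1}, "read"),
    "MAR":      (38, 2, {"nop": 0, "load": 1, "enable": 2, "ld-en": 3}, None),
    "Position": (32, 6, {}, 0),
    "Width":    (27, 5, {}, 31),
    "Am29332":  (18, 9, {}, None),  # length assumed; mnemonics are in Am29332.def
    "Borrow":   (17, 1, {}, 0),
    "Hold":     (16, 1, {}, 0),
    "Data":     (0, 16, {}, 0),
}


def encode(assignments):
    """Build one microword from {field name: value or mnemonic}."""
    unknown = set(assignments) - set(FIELDS)
    if unknown:
        raise KeyError(f"unknown field(s): {sorted(unknown)}")
    word = 0
    for name, (low, length, mnemonics, default) in FIELDS.items():
        value = assignments.get(name, default)
        if value is None:
            continue  # no assignment and no default: field stays zero
        if isinstance(value, str):
            value = mnemonics[value]  # mnemonics are local to the field
        if not 0 <= value < (1 << length):
            raise ValueError(f"{name}: value does not fit in {length} bits")
        word |= value << low
    return word


# Only the fields that differ from their defaults need to be named.
word = encode({"Mem": "write", "MAR": "enable"})
```

Because mnemonics are looked up per field, the same name could carry a different value in another field without conflict.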



In some cases fields may overlap, resulting in several
independent formats being defined for the same bits.
mcDEF provides a structured case statement that
describes each of the formats independently. This allows
very simple selection of the required format within the
assembly source code. Selection may be made by a
specific bit setting, use of a unique field name, or
assigning a value unique to one of the cases.

A case statement demonstrating field overlaying is illus- 
trated in Figure 5-6. 



MICROWORD LAYOUT

|<------------ 16 bits ------------>|
| MemCtrl |          Addr           |   ( case 0, 2-bits MemCtrl, 14-bits Addr )
|               Data                |   ( case 1, 16-bits of immediate Data )

mcDEF DEFINITION

case of
    begin ( two fields )
        addr: length(12); ( address field )
        MemCtrl: length(4); ( memory control field )
    end
    begin ( or one field )
        data: bits (16) ( immediate data field )
    end
endcase;

Figure 5-6. A Variable Format and Case Structure Definition

In the source program, the format is chosen by specifying
'data', or by specifying 'addr' and 'MemCtrl'. Any attempt
to select both formats will result in an error at assembly
time.
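
The selection rule can be sketched as a set test: a statement selects whichever case owns the overlaid fields it names, and naming fields from two cases is an error. The function and data structure below are illustrative assumptions, not the mcASM implementation.

```python
# Fields belonging to each alternative format of the overlaid bits,
# following the case structure of Figure 5-6.
CASES = [
    {"addr", "MemCtrl"},   # case 0: two fields
    {"data"},              # case 1: one immediate data field
]


def select_case(used_fields):
    """Return the index of the selected case, or None if the statement
    names no overlaid field. Raise if fields from two cases are mixed."""
    used = set(used_fields)
    selected = [index for index, fields in enumerate(CASES) if fields & used]
    if len(selected) > 1:
        raise ValueError("statement selects more than one format")
    return selected[0] if selected else None
```

Fields outside the case structure (such as MAR in the sample microword) do not influence the choice in this sketch.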

mcASM - Assembler Program 

Source microcode is assembled by using mcASM, a
structured microcode macro assembler that produces
relocatable object modules as output. mcASM reads the
source file and the model definition table as input. Each
statement of source code is then converted into one or
more microcode words as defined by the definition table.
The output object module format is relocatable, thereby
allowing separate modules to be linked into a larger
executable program.

Microcode instructions are generated by assigning
values to the fields that were defined in mcDEF. Assignment
statements are used to assign values (i.e. fieldname =
value), allowing the fields to be referenced in any order.
Fields with acceptable default values do not need to be
encoded. An example, using the model defined above, is
shown below.

loop: Am29332 = INCR-A

      MAR = enable, Addr = fetch ;

Several features are demonstrated by this example. 

• A single instruction can be continued on several 
lines without special notation. 

• Field references can be grouped so that they refer 
to a common device or action. Fields with accept- 
able default values (such as Mem = read) do not 
have to be encoded. 

• A reference to the Data field in the microword 
would generate an error because it conflicts with 
the case selection caused by the use of the Addr 
field. 

An extensive macro facility allows the user to simplify the 
coding task by representing a large collection of field 
assignments with a single name and a few parameters. 
Macros also allow several microcode words to be gener- 
ated with a single macro definition. The ability of mcASM 
macros to support assignment statements allows the 
user to define a higher level language that greatly re- 
duces coding errors and coding time. For example, the 
instruction in the example above can be replaced with: 

loop: ALU INCR-A; 

where ALU is the macro name. The macro ALU assigns 
the parameter INCR-A to a variable field and fixes the 
values of the rest of the fields such as MemCtrl and Mem. 
Macros can also test the parameter values or names and 
then conditionally generate one of several outputs. 
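
The idea of a macro standing in for a group of field assignments can be modeled in a few lines. The sketch below is not mcASM syntax processing; it assumes the ALU macro expands to exactly the three assignments of the earlier example, and the parser handles only the simple "label: NAME params;" shape.

```python
def alu(operation):
    """Hypothetical body of the ALU macro: one variable field, the
    remaining fields fixed (values taken from the earlier example)."""
    return {"Am29332": operation, "MAR": "enable", "Addr": "fetch"}


MACROS = {"ALU": alu}


def expand(statement):
    """Expand 'label: MACRO param ...;' into (label, field assignments)."""
    text = statement.strip().rstrip(";").strip()
    label, separator, body = text.partition(":")
    if not separator:          # statement carries no label
        label, body = None, text
    name, *parameters = body.split()
    return (label.strip() if label else None), MACROS[name](*parameters)


label, assignments = expand("loop: ALU INCR-A;")
```

A macro that tests its parameters could return different dictionaries for different arguments, which corresponds to conditionally generating one of several outputs.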



mcASM allows the programmer to structure microcode 
source into segments. Labels used within a segment are 
local to that segment allowing the labels to be reused in 
other segments with new values. Individual segments 
and collections of segments (modules) are separately 
assembled so that the whole program does not have to 
be reassembled for each change in source code. 

mcLINK - Linking Loader Program

mcLINK collects the separate segments generated by 
the assembler and combines them into one executable 
program module. In addition, mcLINK supports genera- 
tion of overlays that can be separately loaded into a 
common memory area. 
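
The core of the linking step, placing relocatable segments at absolute addresses and turning segment-relative labels into absolute ones, can be sketched as below. The tuple layout, the "segment.label" naming and the single alignment parameter are assumptions for illustration; overlays and external references are omitted.

```python
def align_up(address, modulus):
    """Round an address up to the next multiple of modulus."""
    return -(-address // modulus) * modulus


def link(segments, origin=0, align=1):
    """Place segments one after another starting at origin.

    segments: list of (name, length in words, {label: offset}).
    Returns (segment base addresses, absolute label addresses).
    """
    location = origin
    bases, symbols = {}, {}
    for name, length, labels in segments:
        location = align_up(location, align)
        bases[name] = location
        for label, offset in labels.items():
            # Labels are local to a segment, so qualify them by name.
            symbols[f"{name}.{label}"] = location + offset
        location += length
    return bases, symbols


bases, symbols = link(
    [("boot", 5, {"start": 0}), ("main", 10, {"loop": 3})],
    origin=0,
    align=4,
)
```

Because each segment carries only offsets, a change inside one segment requires reassembling that segment and relinking, not reassembling the whole program.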

Four absolute output formats are provided. Standard
formats supported by mcASM include AMD AMDASM,
STEP Engineering GENHEX, and Microtec META29.
These three formats allow mcASM code to be used with
existing development systems. A fourth format, called
mcFMT, includes complete information for implementing
overlays and performing symbolic debugging.

While the mcLINK program can generate separate
overlay files in addition to the root program files in these three
standard formats, a single file including overlays and
symbol information is generated when the mcFMT output
is selected.

mcPROM - PROM Formatter Program 

Microcode is generally stored in PROMs in target
machines. mcPROM is provided to divide the absolute linker
output into separate PROM sized files. These files can
then be downloaded to a PROM programmer through a
user supplied communication package.
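
Dividing a wide microword image into PROM-width columns can be sketched as follows. The function names are illustrative assumptions and this is not the mcPROM implementation; the BNPF helper follows the commonly used convention of B, then P for a 1 bit and N for a 0 bit, then F, most significant bit first.

```python
def split_columns(words, word_bits, prom_bits=8):
    """Slice each microword into prom_bits-wide columns.

    Returns one list per PROM, lowest-order column first; each list
    holds that PROM's contents in address order.
    """
    mask = (1 << prom_bits) - 1
    return [
        [(word >> low) & mask for word in words]
        for low in range(0, word_bits, prom_bits)
    ]


def bnpf(value, bits=8):
    """Render one PROM word in BNPF notation."""
    body = "".join(
        "P" if (value >> position) & 1 else "N"
        for position in reversed(range(bits))
    )
    return "B" + body + "F"


# A 24-bit microword image becomes three 8-bit PROM images.
columns = split_columns([0x123456, 0xABCDEF], word_bits=24)
```

A deep image would additionally be cut by address range so that each file fits the depth of the chosen PROM; that step is omitted here.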

Program Features 

The Microtec mcASM structured microcode assembler
system has the following features. 

Definition Program Features 

Microword lengths up to 1024 bits

Variable formats, with multiple fields, predefined in
case statement

Field definition attributes:

BIT - a field may start at any microword bit

LENGTH - total field length (max 16 bits) is specified

VALUE - local mnemonics are assigned to field values

VALID - only values in this list can be used

DEFAULT - the field is assigned a default value






Value modification operators:

COMPLEMENT - uses two's complement of the value

INVERT - inverts all the bits

MASK - removes high bits to set size

REVERSE - reverses the bit order

Definition program directives:

TITLE - adds text string to top of each page

INSTRUCTION - defines the width of the microword

(NO)LIST - (does not generate) generates a listing

(NO)OUTPUT - (does not generate) generates definition table

(NO)XREF - (does not add) adds cross reference

EJECT - advances listing to next page

END - marks end of definition program

Assembly Program Features 

Symbolic addressing 
Conditional assembly facility 
Values assigned to field names 
Powerful macro definition commands:

MACRO - specifies macro name and parameters

BEGIN - marks the start of the macro definition

LOCAL - defines symbols local to this macro

GLOBAL - defines symbols global to program

OUTPUT - outputs source code

IF - processes a statement if variable is true

WARN - issues text string to output listing

ERROR - sends text to listing, ends macro

END - marks end of the macro definition

Flexible macro reference:

Parameter may precede macro name
(P1 macro_name P2)

Positional parameters are assigned values

Keyword parameters have default values

Relocatable output with multiple segments:

SEGMENT - starts or restarts a user-named segment

ENTRY - lists all entry points to a segment

EXTERNAL - lists all labels defined outside the file

Assembler directives:

PROGRAM - names first segment and definition file

EQU - assigns a constant to a name

GLOBAL - defines variable available to all segments

INCLUDE - adds additional source file inline

ORG - sets location counter to new value

TITLE - adds a text string to each listing page

(NO)LIST - (does not generate) generates listing file

(NO)OUTPUT - (does not produce) produces output file

(NO)XREF - (does not generate) generates cross reference

EJECT - advances listing to next page

END - marks end of assembly source

Link Program Features 

Combines independently assembled relocatable
object modules

Resolves external references

Adjusts relocatable addresses into absolute addresses

Versatile user commands:

LINK - loads specified segments from specified file

ORG - changes value of location counter

ALIGN - starts next segment at an address modulo n

OVERLAY - starts and names an overlay

SET - defines external symbols at link time

TRANSFER - reads commands from another file

END - marks end of command entry

Output listing controls:

Load map - area and overlay name, base addresses

Defined and undefined symbol references

Optional symbol cross reference

Object module output in one of four formats:
Microtec mcFMT with overlays and symbols
Microtec META29
STEP Engineering GENHEX
AMD AMDASM and AmSYS29

Conversion Utility Features 

• Separates absolute file into PROM size modules

• Format is DATA I/O ASCII hexadecimal or BNPF 

• Column overlaying 

• Column switching 

• Automatic parity generation 

Minimum Hardware Required 

Any Digital Equipment Corporation VAX System that
operates under VAX/VMS. The software product typically
requires 450K bytes of disk storage after installation.

An IBM PC or compatible system that includes at least
512K bytes of total main memory and one (1) megabyte
of disk storage. Typically the product requires 600K bytes
of disk space for permanent installation with additional
disk storage required for temporary files. Size of
temporary files depends on the volume of user input.

Prerequisite Software 

For distributions pre-installed for Digital Equipment Cor- 
poration computer systems, the appropriate VAX/VMS 
operating system. 

For distributions pre-installed for IBM PC or compatible
systems, PC-DOS or MS-DOS versions 2.1 and newer.



fects in the current, unaltered release of the Soft- 
ware Product via a newsletter. The newsletter 
provides notice of the availability of corrected code. 

Any updates to this product released by Microtec Re- 
search during this warranty period will be provided to the 
customer on standard distribution media at prices speci- 
fied in the prevailing Standard License Fee List. Non- 
standard media can be supplied upon request for an 
additional fee. 

Service required because of customer use of other than 
the current, unaltered release of the Software Product 
operated in accordance with the Software Product De- 
scription (SPD) will be provided at Microtec Research's 
current rates, terms and conditions. 

Ordering Information 

All binary licensed software, including any subsequent 
updates, is furnished under the licensing provisions of 
Microtec Research's Standard Terms and Conditions of 
Sale. These terms provide, in part, that the software and 
any part thereof may be used on only the single CPU on 
which the software is first installed, and may be copied, 
in whole or in part, (with the proper inclusion of the 
copyright notice and any proprietary notices on the 
software) only for use on this CPU. 

Refer to the Standard License Fee List for further order- 
ing and media information or consult Microtec Research. 

Software Product Service 

Post warranty service for this product is available to 
licensed customers by purchasing a Software Product 
Service Agreement. 



Support Category - Microtec Research Supported    Documentation 



During the warranty period, Microtec Research Inc., 
provides the following standard services if the customer 
encounters a problem with the Software Product: 

1 . If Microtec Research determines the problem to be 
a defect in the software product, Microtec Research 
will provide remedial service by telephone if neces- 
sary (1) to apply a temporary correction or make a 
reasonable attempt to develop an emergency bypass 
if the software is inoperable, and (2) to assist 
the customer in preparing a Software Performance 
Report (SPR). 

2. If customer diagnosis indicates the problem is 
caused by a defect in the software product, he may 
submit an SPR. Microtec Research will respond to 
problems reported in SPRs that are caused by de- 



Technical reference manuals are included as part of the 
software product. These manuals provide the informa- 
tion needed to use the software product and are written 
to be used in combination with the language reference 
materials provided by the manufacturer of the micropro- 
cessor. Manuals included are: 

• Microtec mcASM User's Guide 

• Microtec mcASM Reference Manual 

• Microtec mcASM Installation Guide 
For additional information contact: 

Microtec Research, Inc. 
3930 Freedom Circle, Suite 101 
Santa Clara, CA. 95054 
(408)733-2919 

5.4.3 Hilevel Technology, Inc. 
Emulyzer and HALE 

Hilevel's DS3700 Series Emulyzers provide full microc- 
ode development support for Advanced Micro Devices 
Am29300 Series building blocks. The DS3700 combined 
with HALE (an advanced retargetable Macro-Meta As- 
sembler), with software for firmware integration and 
debug, and with a host computer provides a complete 
microcode development system. 

DS Series Emulyzers 

The DS3700 system employs an internal bit-slice archi- 
tecture combined with ECL design to achieve high 
speed, decrease system latency, facilitate product up- 
grades, and implement unique features. The DS3700 
range of features includes: 

• HALE, an Advanced Macro-Meta assembler 

• 10 ns WCS provides 25 ns access times at target 

• 50 MHz logic state analyzer 

• 50 MHz pattern generator 

• Full software support for PC or VAX based 
operation 

• Interactive source code debugging 

• Source presentation of WCS and trace 

• 16-level unrestricted triggering 

• Microcode performance analysis 

• User-defined display formats with bit permutation 
for both WCS and logic analyzer data 

• Command language and command file execution 
of system operations 

• Up to 512-bit-wide WCS and trace 

The DS3700 Emulyzer is available in three different 
configurations to accommodate varied Am29300 devel- 
opment needs: 

1) as an integrated microcode development system 
connecting to an IBM-PC/XT/AT or compatible 

2) as a stand-alone microcode development worksta- 
tion connecting to your host computer. 

3) as an Emulyzer using a VT100 compatible terminal 
providing memory emulation and logic analysis. 

The Emulyzer can be remotely operated from virtually 
any host computer, over either the IEEE-488 or RS232 
standard interfaces. A series of specific computer com- 
mands provides a high degree of Emulyzer control and 
programming flexibility, with provisions for rapid data 
transfer. 



Writable Control Stores 



The Writable Control Store (WCS) portion of the DS3700 
Emulyzer is a high-speed memory which can be written 
to or read from by the DS3700 operator, the development 
workstation, the host computer, and your target machine. 
For RAM emulation, the microprogrammer may read and 
write to the WCS from the target processor. WCS 
memory options with access times of 25 ns at the target 
are ideal for high speed Am29300 operation. 

A choice of fifteen different WCS memory modules is 
available to provide the user with a selection of speeds 
and densities to fill any microprogramming application. 
Memory boards are designed to optimize access times. 
All memory modules are 16 bits wide and are available in 
depths of 1K, 4K, or 16K. Modules may be configured in 
parallel for widths up to 512 bits. 
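
As a rough illustration of the sizing arithmetic above, the following Python sketch computes how many 16-bit modules one microword width would need. The function name and the range check are assumptions made for this example; they are not part of any Hilevel software.

```python
import math

MODULE_WIDTH_BITS = 16       # each WCS memory module is 16 bits wide
MAX_ARRAY_WIDTH_BITS = 512   # modules may be paralleled up to 512 bits


def modules_for_width(microword_bits: int) -> int:
    """Return the number of 16-bit modules needed (hypothetical helper)."""
    if not 1 <= microword_bits <= MAX_ARRAY_WIDTH_BITS:
        raise ValueError("microword width must be between 1 and 512 bits")
    return math.ceil(microword_bits / MODULE_WIDTH_BITS)


print(modules_for_width(100))  # 7 modules give 112 bits, 12 of them unused
```

A 512-bit array therefore corresponds to 32 modules operated in parallel.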

The DS3700 Series can support WCS arrays up to 16K 
deep or 512 bits wide. Additionally, the WCS may be 
configured to support multiple arrays with each array 
configured for a unique size and speed. 

Logic Analyzer 

The DS3700 Series Logic Analyzer section is configured 
in 16-bit increments. Each increment may be clocked 
independently, or any number of these can be clocked 
synchronously. Trigger words may be defined across the 
entire trace width and qualified with ANDs, ORs, complement, 
and not equal. Up to 256 trace channels are 
available in a single chassis; however, chassis may be 
chained for greater widths. Either 4K or 16K deep trace 
memories are available at 25 MHz, 35 MHz, and 50 MHz. 

Trace synchronization is nominally provided via selec- 
tion of one of five clocks. Alternatively, each channel 
group (16 data channels/one clock per group) can be 
synchronized to compensate for clock delays, skewing, 
and multiple timebases. The DS3700 clocking scheme 
allows address (or data) to be delayed one clock cycle to 
align the address trace with its associated data. 

Symbols for trace disassembly and triggering are auto- 
matically created by HALE (Hilevel's Assembler). Addi- 
tional symbols may be defined and stored in the symbol 
table. The symbol table can be saved and restored for 
future use. 

The DS3700 has four triggering modes. 

Single Trigger: Single matchword defined across all 
address and data trace bits with don't care bits. 

External Trigger: A hardware input may be pro- 
grammed to act as a trigger, conditional trigger, or arming 
condition. 



Reprinted with permission from Hilevel Technology, Inc. 

Multi-Level Trigger: Provides 16 levels of trace control 
with up to 4 conditions per level. Multiple commands 
(thirteen total) may be executed on the current clock 
cycle in real-time for any of the 4 conditions. Trigger 
patterns may be specified across the entire address and 
data fields including "don't care" bits. 

Unlimited Break Points: Provides either 16K, 64K, or 
1M of address breakpoints/triggers. 
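
The matchword comparison with don't care bits that the trigger modes above rely on can be sketched in a few lines of Python. This is only a software model of the idea; the function names and the sample trace are assumptions for illustration and do not describe the DS3700 hardware.

```python
def matches(sample: int, pattern: int, care_mask: int) -> bool:
    """True when sample equals pattern on every bit set in care_mask.

    Bits cleared in care_mask are treated as don't care bits.
    """
    return ((sample ^ pattern) & care_mask) == 0


def first_trigger(trace, pattern, care_mask):
    """Return the index of the first matching sample, or None."""
    for index, sample in enumerate(trace):
        if matches(sample, pattern, care_mask):
            return index
    return None


# Hypothetical 8-bit trace: trigger when the upper nibble is 1010,
# ignoring the lower nibble.
trace = [0x12, 0xA7, 0xA0]
print(first_trigger(trace, pattern=0xA0, care_mask=0xF0))  # 1
```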

The DS3700 provides 16 active user-defined trace display 
headings and data formats. Any 4 bits of the trace 
data may be used to change display formats dynamically. 
In addition, symbols may be defined across the entire 
address and data fields and displayed along with the 
formatted data. 

Trace masking is achieved by entering mask addresses 
in a table and then toggling the trace mask function on or 
off. 

Trace permutations (as well as WCS permutations) are 
available to permute the order of display for clear presen- 
tations of the data. 
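
A display permutation of this kind amounts to reordering the bits of each word before it is shown. The Python sketch below illustrates the idea under the assumption that a permutation is given as a list of source bit indices; the function name and this representation are hypothetical, not the DS3700 menu format.

```python
def permute_bits(word: int, order: list[int]) -> int:
    """Reorder the bits of word for display.

    order[i] is the source bit index shown at display position i,
    with position 0 being the least significant bit.
    """
    out = 0
    for dest, src in enumerate(order):
        out |= ((word >> src) & 1) << dest
    return out


# Reverse the order of a 4-bit field: bit 0 is displayed at position 3.
print(bin(permute_bits(0b0001, [3, 2, 1, 0])))  # 0b1000
```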

During debug, using the Interactive Trace Disassembler 
with the DS3700 allows viewing of both the formatted 
trace with symbols and the related source code with 
comments. 

Additionally, trace data may be displayed graphically as 
waveforms. Movement of linear cursors permits comparison 
of waveforms and viewing of timing information. 

Microcode Performance Analyzer 

The TIM-1E option provides an asynchronous clock for 
time-tag and performance analysis operations. Resolution 
of the clock may be set to either 15 ns or 250 ns in 
three operating modes: 

Absolute Time: Allows elapsed time to be measured 
from any selected event; multiple reference points may 
be defined. 

Time Interval: Provides a measurement of the time 
interval between adjacent trace data or any locations in 
the trace buffer. 

Performance Analysis: Up to 15 groups of addresses 
may be defined as performance groups. 

Performance groups of addresses can be defined to 
generate statistical performance analysis histograms, 
address vs. frequency of address and address groups vs. 
time spent in groups, to allow the engineer to measure 
firmware efficiency. For example, time spent in subrou- 
tines, interrupt handlers, and in arithmetic functions can 
be measured. Dynamic graphing is available to actually 
view the performance in real time. 
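
The histogram of time spent per address group can be modeled offline from a time-tagged trace. The following Python sketch assumes the trace is a list of (time in ns, address) samples and attributes each interval to the group containing the earlier address; the group names, address ranges, and function names are invented for illustration.

```python
from collections import defaultdict


def time_per_group(trace, groups):
    """Sum the time spent in each address group.

    trace:  list of (time_ns, address) samples in time order.
    groups: {name: range_of_addresses}; the first matching group wins.
    """
    totals = defaultdict(int)
    for (t0, addr), (t1, _next_addr) in zip(trace, trace[1:]):
        for name, addr_range in groups.items():
            if addr in addr_range:
                totals[name] += t1 - t0
                break
    return dict(totals)


def percentages(totals):
    """Convert absolute times into relative percentages."""
    whole = sum(totals.values())
    if whole == 0:
        return {}
    return {name: 100.0 * t / whole for name, t in totals.items()}


# Hypothetical groups and a trace tagged at 250 ns resolution.
groups = {"multiply": range(0x100, 0x180), "interrupt": range(0x200, 0x240)}
trace = [(0, 0x100), (250, 0x101), (500, 0x200), (1000, 0x100)]
totals = time_per_group(trace, groups)
print(totals)               # {'multiply': 500, 'interrupt': 500}
print(percentages(totals))  # {'multiply': 50.0, 'interrupt': 50.0}
```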



Pattern Generator 

The PG201 Option allows the Emulyzer to function as a 
digital stimulus response tester. Sequential or pro- 
grammed vectors (or instructions) may be applied to the 
target and the response recorded. Using the Emulyzer 
Programming Language, the trace may be uploaded and 
compared to a known good file. The multilevel trigger 
may be used to set conditions for the pattern generator 
so that different vectors may be applied after a certain 
response has been recorded. The PG201 card also 
allows fast firmware-generated patterns to be inserted 
anywhere within the WCS. Walking ones, walking zeros, 
checkerboard, and random patterns may be merged with 
writable control store or used to fill the WCS. The PG201 
may be used to emulate a controller, such as the 
Am29PL141, which controls or sequences the target 
hardware. 
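
For readers unfamiliar with the fill patterns named above, the Python sketch below generates walking ones, walking zeros, and checkerboard data for a given word width. It is a software illustration only; the function names are assumptions and say nothing about how the PG201 produces these patterns.

```python
def walking_ones(width: int) -> list[int]:
    """One word per bit position, with a single 1 moving from LSB to MSB."""
    return [1 << i for i in range(width)]


def walking_zeros(width: int) -> list[int]:
    """One word per bit position, with a single 0 moving from LSB to MSB."""
    mask = (1 << width) - 1
    return [mask ^ (1 << i) for i in range(width)]


def checkerboard(width: int, depth: int) -> list[int]:
    """Alternate ...0101 and ...1010 on successive words."""
    mask = (1 << width) - 1
    even = int("01" * width, 2) & mask
    return [even if row % 2 == 0 else even ^ mask for row in range(depth)]


print([f"{w:04b}" for w in walking_ones(4)])     # ['0001', '0010', '0100', '1000']
print([f"{w:04b}" for w in walking_zeros(4)])    # ['1110', '1101', '1011', '0111']
print([f"{w:04b}" for w in checkerboard(4, 3)])  # ['0101', '1010', '0101']
```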

HALE - An Advanced Retargetable 
Macro-Meta Assembler 

• Includes Am29300 Definition Files 

• Increases User Productivity 

• Allows Coding Optimization 

• Pipeline Macros Ideal for Am29300 Blocks 

• Assembles on Several Computers 

• Relocatable Linkable Code 

• Matched to Development System 

HALE provides the microprogrammer with a set of facili- 
ties to rapidly create instruction sets and quickly write, 
assemble, and check his programs against design rules. 
For building custom instruction sets or emulating instruction 
sets, HALE increases programming efficiency and 
gets the job done fast. 

HALE supports several programming techniques to 
accommodate varied programming styles and architec- 
tural requirements. Free-formatting, fixed-format instruc- 
tions, position-independent code, macros, and pipeline 
macros each provide specific programming benefits. 
Techniques are often mixed in programs to provide the 
optimum control and ease of programming. 

Am29300 programmers using HALE receive the benefits 
of an assembler that allows source presentations (your 
actual instruction), comments, and symbolic debug when 
used with a HILEVEL DS Series Emulyzer. These inte- 
gration tools speed development. 

HALE is easy to use and is a quickly learned assembler. 
Generating productive code with HALE begins within the 
first few minutes of use. Straightforward coding and 
simple definitions of powerful high-level macros permit 
code to be tested right away. 

Pipeline macros allow the programmer to optimize the 
utilization of his hardware resources. Macros may be 
defined for fields, combinations of fields, or along functional 
boundaries, and a macro may be invoked again 
while earlier calls are still generating code. This 
allows highly overlapped and compacted code to be 
written. 

Pipeline macros are particularly useful for the Am29300 
series since they are designed along functional bounda- 
ries. Pipeline macros written for the multiplier 
(Am29C323), a floating point processor (Am29325), and 
an arithmetic logic unit (Am29332) in an architecture 
combining these resources would allow tight control and 
economy of code for their independent and interdependent 
operations. 

Pipeline macros are well suited for n-stage pipelined 
architectures, DSP algorithms, and pipelined multiplier 
operations, and they add programming elegance. Once 
pipeline macros are written for an element, they are invoked 
and closed out with two simple commands. Up to eight 
pipeline macros can be operated simultaneously. Pipeline 
macros are position independent. 

Calls to pipeline macros are limited only by the process- 
ing element's latency period, allowing maximum data 
flow processing. Pipeline macros also simplify coding for 
elements that introduce pipeline delays into the target 
hardware. 
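
The overlapping behavior described above can be modeled in Python as merging the per-cycle field settings of each macro call into a shared list of microinstruction words. This is a conceptual sketch, not HALE syntax: the `invoke` function, the dictionary representation of a microword, and the field names `mul_load` and `mul_out` are all assumptions for illustration.

```python
def invoke(program, start, macro):
    """Overlay one pipeline macro call onto the program.

    program: list of microwords, each a {field: value} dict.
    start:   index of the word where the call begins.
    macro:   list of {field: value} dicts, one per pipeline cycle.
    A field already set to a different value is reported as a conflict.
    """
    for offset, fields in enumerate(macro):
        while len(program) <= start + offset:
            program.append({})
        word = program[start + offset]
        for field, value in fields.items():
            if field in word and word[field] != value:
                raise ValueError(f"field {field!r} conflict in word {start + offset}")
            word[field] = value
    return program


# Hypothetical 3-cycle multiplier macro: load, wait, read result.
MULT = [{"mul_load": 1}, {}, {"mul_out": 1}]

program = []
invoke(program, 0, MULT)  # first call
invoke(program, 1, MULT)  # second call starts while the first is in flight
print(program)
# [{'mul_load': 1}, {'mul_load': 1}, {'mul_out': 1}, {'mul_out': 1}]
```

Two three-cycle calls occupy four words instead of six, which is the kind of compaction the text attributes to overlapped invocations.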

Pipeline macros may contain conditional assembly state- 
ments allowing the automatic selection of microcode 
sequences for a given operation. 

User definable errors allow the microprogrammer to 
assert design rules and check his code against them. 
This saves time by catching errors during assembly 
rather than at debug and integration time. When mi- 
croarchitectural constraints change, the program may be 
reassembled with new rules and checked against them. 
Instead of searching for potential errors, valuable time is 
saved by the automatic detection of errors. 

User definable warnings allow the programmer to write 
non-assembling messages at any location in the source 
program. These messages may be used to follow assembly 
program flows or flag untested routines. Incomplete 
cases within macros may be detected by inserting a 
warning message as the last case. If an undefined case 
is called, the warning will be displayed. Warning messages 
assist the programmer in directing his attention to 
areas of concern and correcting them before they show 
up as problems during firmware integration time. 

While and Endwhile looping directives allow code between 
these directives to be generated as long as a user 
specified boolean equation is true. While A<B, While 
A+B<C, and While A=B are examples showing the versatility 
of this directive. "While loops" may be nested up 
to 15 levels deep. "While loops" are also particularly 
useful in pattern generation applications. 

ASCII statements convert ASCII code to its binary 
equivalent, which may then be embedded within the 
microcode. Data may be coded directly into microcode in 
ASCII format. ASCII conversions are useful for passing 
messages, strings, or variables from one part of your 
target to another. 

Macro facilities allow the assignment of a name to either 
a single microinstruction or to a sequence of microinstructions. 
Macros allow parameters to be passed to 
points within the macro body. A multiply macro may 
consist of 100 lines of code, yet may be invoked by a 
single call (e.g., Mult A,B). Macros permit the generation 
of assembly language for your target or even higher level 
languages if one builds macros from macros. Macros 
may be nested up to 15 levels deep. Macros may call 
pipeline macros to generate extremely powerful code. 

Conditional assembly statements can be used to 
generate high-order instructions that can accomplish a 
number of things based upon variable inputs: for example, 
executing either signed or unsigned functions, 
selecting the correct microcode for a specific task (automatic 
instruction selection), or interrogating the hardware 
and conditionally executing different microcode 
sequences (context switching). Conditional assembly 
statements allow the construction of powerful macros. 

String facilities are used to identify variables and compare 
entire strings or portions of strings with each other. 
When combined with other assembly directives, different 
routines based upon the results of the compares can be 
invoked. 

Expressions, operators, and modifiers allow versatile 
assembly program control. Addition, subtraction, multiplication, 
division, less than, greater than, equal to, and 
combinations thereof can be used to generate and 
modify variables. Other commands available include 
shifting, negation, modulo addressing, relative addressing, 
and absolute addressing. 

HALE's PROM formatter outputs in HILEVEL ASCII, 
AMDASM, DATA I/O, and Intel Intellec Hex to adapt to 
your specific PROM programming needs. 
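
As an illustration of what a hex PROM format involves, the Python sketch below builds one record in the widely documented Intel HEX layout (byte count, 16-bit address, record type, data, two's-complement checksum). It assumes that the "Intel Intellec Hex" output named above follows this common layout; the function name is hypothetical, and the HILEVEL ASCII, AMDASM, and DATA I/O formats are not shown.

```python
def intel_hex_record(address: int, data: bytes, record_type: int = 0x00) -> str:
    """Build one Intel HEX record line.

    The checksum is the two's complement of the low byte of the sum of
    all preceding bytes in the record.
    """
    body = bytes([len(data), (address >> 8) & 0xFF, address & 0xFF, record_type]) + data
    checksum = (-sum(body)) & 0xFF
    return ":" + body.hex().upper() + f"{checksum:02X}"


print(intel_hex_record(0x0030, bytes([0x02, 0x33, 0x7A])))  # :0300300002337A1E
print(intel_hex_record(0x0000, b"", record_type=0x01))      # :00000001FF (end of file)
```

A microword wider than eight bits would have to be split into byte-wide slices before being written this way, one file or address range per PROM.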

HALE allows the linking of relocatable code so that 
several software modules may be developed in parallel, 
allowing completion of the programming task sooner. 

Over 4000 source and definition symbols allow virtually 
unlimited amounts of code to be written. Word widths of 
up to 256 bits are supported, accommodating highly 
parallel architectures. 

Figure 5-7. Development flow: use HALE to define the 
instruction set and write applicable software; test the 
system using the DS3700 Emulyzer; use PATCHWORK to 
correct errors and pass this information back to HALE. 



HALE runs on the IBM-PC/XT/AT, VAX, and Apollo 
computers. HALE runs all programs developed using 
AMDASM or Microtec Meta Assemblers, assuring the 
best possible return on your software investment. 

Software Tools for Firmware Integration 
and Debug 

Patchwork for fast effective microcode changes 

Patchwork is an interactive assembler that permits the 
user to write the patches in assembly mnemonics and 
immediately test them. Temporary patches can be easily 
made and removed based upon the date they were 
made. Patchwork records each change, comments, date 
and time. Each change that creates new object code is 
appended to the listing and source files. In addition, a log 
file maintains a complete record of the entire editing 
session. 

Alternatively, the user can utilize the object code editor in 
the DS3700 to make changes in the microcode residing 
in the WCS. In this mode, the WCS data is displayed in 
the same format as the HALE Macro-Meta Assembler 
object code listing. 

Single-Step for tough debugging problems 

The Single-Step program allows examination of the 
trace, source code, and comments together on a line by 
line basis. Each line shows what instruction was executed 
and what in fact happened. Using Single-Step, 
problems stand out and solutions often become apparent. 
Invoke Patchwork, make the desired changes, and 
Single-Step again. For programmers writing code or 
maintaining it, the line by line comments allow quick 
recognition and interpretation of the instructions, thus 
reducing debug time. 



Formatted Trace for full speed debugging 

Formatted trace helps find errors that occur during real 
time execution. After a full speed run, Formatted Trace 
allows stepping through the trace buffer presenting 
source code and comments together. This allows fast 
identification of problem areas, and points to instructions 
causing problems. 

Trace Waveform for full logic analysis 

Trace Waveform conveys a visual historical record of 
target board operation at a glance. It allows converting all 
or any combination of trace channels into timing dia- 
grams. Labels may be assigned to each trace channel for 
clarity and recognition. A label file (containing the names 
of your traces) and a setup file (which holds parameters 
such as magnifications and scroll modes) can be cre- 
ated, saved and conveniently accessed in future uses. 
Cursor controls make comparison of non adjacent wave- 
form edges easy. Channel order may be permuted. 

Screen Driven 

The Hilevel Emulyzer provides screens for convenient 
system set-up and operation. Each screen may be con- 
figured, saved and restored by the operator or by the 
Emulyzer Programming Language. The full range of 
Emulyzer operations is contained within the screens: 
for example, writing multilevel trigger programs, 
setting the logic analyzer's breakpoints, running and 
tracing the microcode program, and analyzing microcode 
performance. Each screen is designed for maximum 
utility and optimum information display. 

Automated Emulyzer Operation 

EPL (Emulyzer Programming Language) automates the 
Emulyzer operation through the use of high-level commands. 
EPL permits the execution of command files that 
are used to set up the development environment (download 
the WCS, download multilevel trigger programs, 
download display format, etc.) and later save it. This 
allows multiple users fast and easy access to the development 
system while managing their files safely. 

Microcode Quality Control 

Microcode quality can be assured by repetitive testing. 
EPL provides commands that allow looping, uploading 
trace data, and comparisons against known good files. 
Using EPL, extended tests can be used to catch elusive 
program bugs. 
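
The repetitive test loop described here (run, upload the trace, compare with a known good file) can be sketched as follows. Python stands in for EPL, whose command syntax is not given in this text; `first_mismatch`, `soak`, and the `run_once` callback are hypothetical names introduced for illustration.

```python
def first_mismatch(trace, golden):
    """Return the index of the first differing sample, or None if identical."""
    for index, (got, want) in enumerate(zip(trace, golden)):
        if got != want:
            return index
    if len(trace) != len(golden):
        return min(len(trace), len(golden))
    return None


def soak(run_once, golden, passes):
    """Repeat a test run and compare each uploaded trace with the golden data.

    run_once: callable returning one trace as a list of samples
              (an assumed stand-in for running the target and uploading).
    Returns (pass_number, sample_index) for the first failure, else None.
    """
    for pass_number in range(passes):
        index = first_mismatch(run_once(), golden)
        if index is not None:
            return (pass_number, index)
    return None


# Simulated runs: the third pass shows an intermittent fault at sample 1.
runs = iter([[1, 2], [1, 2], [1, 3]])
print(soak(lambda: next(runs), golden=[1, 2], passes=3))  # (2, 1)
```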

System Software 

Hilevel's system software allows the user to customize 
his development system. Keys may be assigned to 
invoke any program including HALE, EPL, Patchwork, 
Single Step, Formatted Trace, and Waveform. Often-used 
keyboard routines may be defined as keyboard 
macros and are invoked with a single keystroke. 

In-Circuit Emulators 

HILEVEL In-Circuit Emulators are available for a variety 
of microcoded processors and support devices. Emulation 
is accomplished by placing the target device in a 
socket on the appropriate emulation pod and plugging 
the pod into the device socket in the system. The pod is 
controlled by the EC1000 controller, which can accommodate 
up to four pods simultaneously. The EC1000 
features a built-in keyboard and LCD display to support 
stand-alone operation. 

The EC1000 may be connected to the DS3700 Development 
System, allowing the microprogrammer to control 
the Emulyzer and review data using the development 
system console. Using the EC1000 in concert with the 
development system also takes advantage of the 
DS3700's multi-level triggering capabilities. 



All control and display capabilities necessary for comprehensive 
device emulation are designed into the EC1000: 

• Decimal, Hex, Octal, Binary, ASCII 

• Target single step or multiple step capability 

• Displays registers whose contents match speci- 
fied data 

• Allows changes to any part of any register 

• Allows control to be transferred to DS3700 or 
VT100 compatible terminal 

• EEPROM allows customization of default para- 
meters 

• External trigger allows external logic or test 
equipment to halt the Emulator 

Emulation pods currently offered by Hilevel for Advanced 
Micro Devices are the Am2910 sequencer, Am29116 
ALU, and Am29PL141 Fuse Programmable Controller. 

For additional information contact: 

Hilevel Technology, Inc. 
18902 Bardeen 
Irvine, CA. 92715 
(714)752-5215 
TLX 655-316 



DS3700 SERIES SPECIFICATIONS 
Writable Control Store (WCS) 

Depth: 1K to 64K, depending on memory configuration. 
Array Width: 0 to 512 bits in 16-bit increments. 
RAM Speed: 10 ns to 120 ns, depending on memory module 
selected. 
System Access Time: 25 ns to 140 ns, depending on memory 
module and pod selected. 
Number of Independent Arrays: 16 maximum. 
Target Control: Break (Halt), clear, single-step, continuous 
slow step, full speed emulation, break on event(s), 
PROM enable. 
Editing Modes: 
  DS3700: Screen oriented editing with full search, scroll, 
  page and window operation. 
  DS3700/CS: Full Interactive Source Code Debug. 
WCS MEMORY MODULES: See the WCS MEMORY MODULES table. 

WCS INTERFACE PODS 

Logic Type: TTL, 10K ECL, or 100K ECL. 
POD Types: Data, Address, Master Pods. 
Output Signals: 
  Data Pods: 16 Data bits per pod. 
  Master Pods: 16 Data bits, clock enable, target reset, 
  2925 run control. 
  Address Pods: Clock enable, target reset, ROM enable, 
  2925 run control. 
Signal Inputs: 
  Address Pods: 16 Address bits, clock input. 
  Master Pods: 16 Address bits, clock input, PROM enable. 
Target Connection: Connector or PROM socket. 
Type of Memories Emulated: ROM, PROM, SRAM. 
Additional Support: 
  Registered Memories: Yes, with initialization. 
  Chip Select/Chip Enable: Up to 3. 
Pod Size: 
  Data and Address Pods: 0.75" H x 2.75" W x 4" L 
  Master Pods: 1.5" H x 2.75" W x 4" L 



Logic State Analyzer (Trace) 

Number of Input Channels: 
  DS3700 Mainframe: 0 to 80 channels in 16-channel increments. 
  DT37XX Mainframes: 0 to 256 channels in 16-channel increments. 
Maximum Clock Rate: 25 or 35 MHz, depending on type of trace 
memory selected. 

TRACE MEMORY MODULES: 

Model          Depth   Speed    Width 
TRC/MLT-25     4K      25 MHz   16 bits 
TRC/MLT-35     4K      35 MHz   16 bits 
TRC16/MLT-25   16K     25 MHz   16 bits 

TRIGGER, BREAKPOINT AND 
TRACE CONTROL MODES 

Modes: External trigger, single event trigger, unlimited 
break/trigger (UBE option), and multi-level trigger/trace 
control. 
Mode Combinations: Any combination, except single event 
trigger and multi-level trigger, can be used simultaneously. 






External Trigger: 
  Input: BNC connector 
  Level: TTL 
  Active State: Negative-going transition 
Single Event Trigger: Single level condition specified across 
entire address and data fields. 
Unlimited Break/Trigger: 
  Description: Address field can be used to specify 
  trigger/breakpoint events for simultaneous monitoring. 
  Address Range: 
    Option   Range 
    UBE-16   16K 
    UBE-64   64K 
  Type of Trigger: Any address or address range may be 
  specified as a trigger, conditional trigger or arming word. 

Multi-Level Trigger/Trace Control: 
  Number of Levels: 16 independent levels. 
  Conditional Patterns: 4 per level across entire address and 
  data fields. 
  Condition Formats: Bit patterns with user defined format, 
  and symbols (user defined or assembler generated). 
  Boolean combination of symbols: Symbols may be combined 
  with the following expressions: AND, OR, COMPLEMENT, 
  NOT EQUAL. 
  Multiple Action Commands: Up to 9 concurrent commands per 
  condition. 
  Action Commands: 13, as shown below. 
    1. Trigger 
    2. Conditional Trigger 
    3. Arm Trigger 
    4. Unarm Trigger 
    5. Reset Trigger 
    6. Disable Trace 
    7. Enable Trace 
    8. Override Trace Disable 
    9. Disable Trace Mask 
    10. Zero Timer 
    11. Jump to level <N> 
    12. Initialize loop/event counter 
    13. Assert Pattern Generator Conditional Control 
  Loop/Event Counter: Up to 65,535 events 
  Trigger Delay: 0 to 4095 clock cycles 
  Breakpoints: Independent on/off control 



TRACE MODES 

Modes: State analysis; State timing, absolute elapsed time; 
State timing, interval; Performance analysis; and 
Dynamic performance graphing. 
State Timing (absolute and interval): 
  Resolution: 15 ns or 250 ns, selectable 
  Maximum Time: 
    Low Resolution: 16 minutes 
    High Resolution: 1 minute 
    Using Trace control: 216 hours 
Performance Analysis (TIM-1E and UBE options): 
  Number of Groups: 15 
  Group definition: Any subset of the address range. 
  Address Range: 
    Option   Range 
    UBE-16   16K 
    UBE-64   64K 
  Operation: Logic analyzer stores group transitions. 
  Display: Both histogram and absolute time chart. 
    Histogram: Relative % of execution time used by each 
    defined group. 
    Absolute: Total execution time of each group. 
  Group Name: Up to 15 characters. 
  Time Resolution: 15 ns or 250 ns, selectable. 
Dynamic Performance Graphing 
  Number of Groups: 15 
  Group definition: Any subset of the address space. 
  Address Range: 64K 
  Operation: Logic analyzer dynamically updates trace memory 
  and displays graph of percentage of events within each 
  group. 
  Display: Histogram 

SYMBOLIC TRACE 

Description: Symbols may be defined using entire address 
and data fields. 
Display: Symbols will be displayed along with user 
formatted data. 
Use: Symbols may be used for trace display, trace 
control/trigger condition statements, search/locate 
operations, and time interval measurements. 
Source: Symbols may be defined using DS3700 menu or 
downloaded from HALE definition files. 
Maximum Characters per Symbol: 15 
Maximum Number of Symbols: Depends on number of 
characters per symbol and width of data fields. >1000 
symbols with average of 7 characters when defined on 
address field. 

TRACE MASK (UBE OPTION) 

Description: Unconditionally masks from trace any user 
specified address or range of addresses. 
Maximum Mask: Any subset of address range. 
Address Range: 
  Option   Range 
  UBE-16   16K 
  UBE-64   64K 

TRACE PODS 

Logic Type: TTL, 10K ECL, or 100K ECL. 
Signal Inputs: 16 data bits, clock. 

Display Formatting 

DS3700: Any user selected combination of hexadecimal, 
binary, and/or octal. 
DS3700/CS: Full interactive WCS and Trace Disassembly. 
Multiple Formats: Any 4 bits of each array and trace may be 
used to select between 16 user specified formats. 
User Defined Headings: 
  Maximum number of characters: 256 
  Multiple headings: Up to 16 to match multiple formats. 
Display Permutation: Any bit may be displayed in any 
position within WCS and Trace displays. 

DS3700 Mainframe 

WCS Size: Accepts up to 8 WCS memory modules (128 bits). 
Number of Arrays: One 
Trace Size: Accepts up to 5 trace memory modules 
(80 channels). 






Interfaces: 
  RS232: 3 ports 
  High Speed Parallel: 1 port 
  GPIB (IEEE-Std-488): 1 port (Optional) 
BNC Inputs: External clock, external trigger 
BNC Outputs: Arm output, trigger output. 
Annunciation: Front panel LEDs show status of trigger, 
GPIB interface, clocks, and operational controls. 

DT37XX Mainframe 

WCS Size: None, requires EXP3700 for WCS operation. 
Trace Size: Accepts up to 16 trace memory modules 
(256 channels). 
Interfaces: 
  RS232: 3 ports 
  High Speed Parallel: 1 port 
  GPIB (IEEE-Std-488): 1 port (Optional) 
BNC Inputs: External clock, external trigger 
BNC Outputs: Arm output, trigger output. 
Annunciation: Front panel LEDs show status of trigger, 
GPIB interface, clocks, and operational controls. 

EXP3700 Expansion Chassis 

WCS Size: Accepts up to 16 WCS memory modules (256 bits). 
Number of Arrays: May be configured as one or two arrays. 



Operating Specifications 
(DS3700, DT37XX, EXP3700 chassis) 

Chassis Size: 7" H x 18" W x 23" D 
Weight: 60 to 70 lbs depending on options included. 
Operating Temperature: 15°C to 35°C 
Operating Humidity: 10 to 80% RH 
Power Requirements: 90 to 132 VAC, or 180 to 250 VAC; 
50 or 60 Hz. 
Warranty: 1 year limited warranty. 




WCS MEMORY MODULES 

            Depth         Emulation   RAM Speed  System Speed (ns)** 
Model       1K  4K  16K   PROM  RAM   (ns)       25  35  40  50  90  140 
E1K-10      X             X           10         X 
M1K-20*     X             X           20             X 
M1K-35*     X             X           35                     X 
E4K-10          X         X           10         X 
E4KW-10         X         X     X     10         X 
E4K-25          X         X           25                 X 
E4KW-25         X         X     X     25                 X 
M4K-25*         X         X           25                 X 
M4K-35*         X         X           35                     X 
M4K-120*        X         X           120                            X 
E16K-25             X     X           25                 X 
E16KW-25            X     X     X     25                 X 
M16K-35*            X     X     X     35                     X 
M16K-70*            X     X     X     70                         X 
M16K-120*           X     X     X     120                            X 

*M Series memory modules require EXP370-4 expansion chassis. 
**Access times specified at target side of pod. 

5.4.4 Hewlett-Packard 
Microprogram Development Support 

HP 64276 Microprogram Development Subsystem 

Description 

The HP 64276 Microprogram Development Subsystem 
and the HP 64320S 25 MHz Logic State/Software Analyzer 
provide run control and real-time analysis for the 
AMD Am29300 family. As integrated subsystems of the 
HP 64000 Logic Development System, the HP 64276 
and the HP 64320S add the power of run control and 
analysis to all phases of the design, development, and 
maintenance of Am29300-based products. 

The Microprogram Development Subsystem consists of 
three components: a Run Control module, a Writable 
Control Store (WCS), and a 25 MHz Logic State/Software 
Analyzer. Run Control provides program flow control, 
clock control, and break event detection. Writable 
Control Store provides high speed RAM for storing the 
microcode to be executed. A 25 MHz Logic State/Software 
Analyzer monitors system buses and provides 
trigger, store, and sequencing functions for locating 
problems in the microprogram. Integration of the Microprogram 
Development Subsystem with other powerful 
HP 64000 analysis and emulation tools allows for interactive, 
cross-triggered measurements in complex multiprocessor 
environments. 

Features 

• The choice of clock control or real-time address 
jam at break detection offers flexible target 
system control. 

• Address ranging and two-level sequencing 
provide powerful break event specification. 

• Real-time, nonintrusive analysis of microprogrammed
system activity reduces software development time.

• Flexible user-definable microassembler provides 
support for a wide variety of Am29300-based 
designs. 

• Microcode source interleaved with analyzer trace 
data speeds software debugging. 

• Linking of separately assembled microcode 
modules accelerates software turnaround time. 

• MACRO instruction feature of the microassem- 
bler improves software engineering productivity. 

• Modular architecture permits specific Writable 
Control Store configurations for customized 
development tool needs. 

• Integration of Run Control and analysis capabili- 
ties simplifies operation. 



• Interaction with other HP 64000 System Emula- 
tors and analyzers provides real-time analysis in 
multiprocessor environments. 

Run Control 

Run Control provides system clock control, break
event specification, and address jamming. These important
features improve debugging of Am29300-based systems.

Architecture 

The Run Control module taps into the clock lines on the 
target system to obtain the greatest level of clock control. 
Clock control functions allow you to start and stop the 
clock, single step, and break on a specific clock edge or 
pattern. 

The Run Control module provides 20 I/O lines to probe 
the address bus, monitor status bits, or drive control 
lines. These I/O lines are bused internally to the Writable 
Control Store and the state analysis data probe connec- 
tors on the Run Control module. 

Both single leads and coaxial cable leads are supplied for
probing the clock and control lines between the target
system and the Run Control module. Coaxial leads are
recommended at higher clock rates to ensure
better signal quality.

Clock Control 

Precise specification of clock edges and relationships is 
critical for breaking or halting the clock in target systems 
with multiple clock signals. The Run Control Module 
allows you to specify complex clock signal characteris- 
tics for use in break events. 

Address Jamming 

Address jamming forces program execution to begin at a specific
address when a starting point other than the system reset
vector location is desired. For example, to force the
execution of a monitor routine that displays the registers,
an address is jammed onto the address bus, causing the
program to jump to the monitor routine. With the HP
64276 Microprogram Development Subsystem, you can
jam 8, 12, 16, or 20 address lines.

Break Events 

The HP 64276 allows you to initiate a break event after
the detection of any of the following occurrences: an
address pattern (up to four can be specified), an address
range, or a two-term sequence of an address pattern,
range, or both. The state analysis trigger can also enable
break event detection. When a break event occurs, an
address can be jammed onto the address bus (e.g., to a
monitor program) or the system clock can be stopped.
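
The break conditions described above can be pictured with a small software model. The following Python sketch is illustrative only: the class name, the method names, and the rule that the second sequence term is armed by the first are assumptions made for this example, and it does not describe the actual HP 64276 hardware or its command syntax.

```python
class BreakDetector:
    """Toy model of break-event detection on a stream of addresses.

    Hypothetical illustration; names and exact semantics are assumed.
    """

    def __init__(self, patterns=(), address_range=None, sequence=None):
        if len(patterns) > 4:
            raise ValueError("at most four address patterns can be specified")
        self.patterns = set(patterns)
        self.address_range = address_range   # (low, high), inclusive
        self.sequence = sequence             # (first_term, second_term)
        self._first_seen = False

    @staticmethod
    def _term_matches(term, address):
        # A term is either a single address or an inclusive (low, high) range.
        if isinstance(term, tuple):
            low, high = term
            return low <= address <= high
        return address == term

    def observe(self, address):
        """Return True when this address completes a break event."""
        if address in self.patterns:
            return True
        if self.address_range and self._term_matches(self.address_range, address):
            return True
        if self.sequence:
            first, second = self.sequence
            if self._first_seen and self._term_matches(second, address):
                self._first_seen = False
                return True
            if self._term_matches(first, address):
                self._first_seen = True
        return False


# Break when address 0x100 is later followed by any address in 0x200-0x20F.
detector = BreakDetector(sequence=(0x100, (0x200, 0x20F)))
for addr in (0x205, 0x100, 0x050, 0x208):
    print(hex(addr), detector.observe(addr))
```

In this model, the caller decides what a break does (jam an address or stop the clock); the detector only reports that the event occurred.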

Writable Control Store

The Writable Control Store (WCS), the memory array for
the system microcode, consists of a dual-port RAM that
allows easy microcode downloading from the assembly
environment and high-speed access to the microcode by
the microprogram target system. Target system development
and debugging are more efficient using the WCS
instead of the target system control store.

Architecture 

The Writable Control Store (WCS) contains either one or
two 32 Kbyte memory boards. Each board can be configured
into one of three array sizes (bits wide by words
deep): 16 by 16K, 32 by 8K, or 64 by 4K. With two WCS
boards in the subsystem, the microword widths are
doubled.
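
As a quick check of the arithmetic, each of the three array sizes uses exactly the 32 Kbytes of one board, and a second board doubles the width while leaving the depth unchanged. The short Python sketch below works this out; the function and constant names are chosen for this example only.

```python
# Capacity check for the WCS board configurations quoted above.
BOARD_BYTES = 32 * 1024  # one 32 Kbyte memory board

# (bits wide, words deep) for a single board, as listed in the text
SINGLE_BOARD_CONFIGS = [(16, 16 * 1024), (32, 8 * 1024), (64, 4 * 1024)]


def capacity_bytes(width_bits, depth_words):
    """Return the storage needed for a width x depth array, in bytes."""
    return width_bits * depth_words // 8


def two_board_configs(configs):
    """With two boards the microword width doubles; the depth is unchanged."""
    return [(2 * width, depth) for width, depth in configs]


# Every single-board configuration fills the board exactly.
for width, depth in SINGLE_BOARD_CONFIGS:
    assert capacity_bytes(width, depth) == BOARD_BYTES

for width, depth in two_board_configs(SINGLE_BOARD_CONFIGS):
    print(f"{width} bits x {depth // 1024}K words")
```

The widest two-board arrangement, 128 bits by 4K words, matches the 128-bit maximum microword width quoted for the microassembler later in this section.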

The WCS address is obtained from the Run Control
module, eliminating the need to probe the target system
a second time. By using one of the WCS address lines as
an enable control to three-state the WCS output, you can
toggle between target memory and subsystem memory.

Load 

Once microcode has been assembled and linked, it is
downloaded from the software development environment
to the Writable Control Store for execution. Transferring
microcode is fast and easy with the integrated
development and hardware execution environments of
the Microprogram Development Subsystem.

List 

When debugging microcode, you can examine the contents
of the WCS and list them to a destination file, a
printer, or a display. A single list command specifies from
one to four addresses or groups of contiguous WCS
addresses. Displaying the address ranges allows you to
examine and compare the microcode in different subroutines.

Modify 

While debugging, you can modify the absolute code and 
continue debugging. Modify can be specified for up to 32 
bits at a time for either a single WCS address or a range 
of addresses. 

Save 

The absolute code stored in WCS can be saved to a disc
file for later reloading or for verifying the correctness of
changes to source microcode.

User-defined 

You can design a custom WCS array and combine it with 
the other modules of the Microprogram Development 
Subsystem. The combination of the HP 64000 Logic
Development System, the HP 64276 Run Control, and 
the user-defined WCS array provides an integrated 
development solution for all Am29300 microprogram 
target systems. 

The user-defined WCS interface supports any array size
between 16 by 512K and 1024 by 8K (bits wide by words
deep). The interface between the HP 64000 mainframe
and the user-definable WCS consists of control lines and
parallel address and data buses that allow data to be
written to or read from the WCS. User-definable control
sequences can be transmitted to the user's WCS preceding
and following an upload or download operation.

25 MHz Logic State/Software Analyzer 

The HP 64320S 25 MHz Logic State/Software Analyzer
adds high-speed, real-time, nonintrusive software analysis
to the HP 64000 Logic Development System. This
flexible analyzer works well in microprogram software
analysis, general-purpose software analysis, and system
integration. Measurement results are displayed in
source microcode (including MACROs and comment
lines) or in user-defined symbols that minimize the need
to decode captured data. The analyzer can also reference
symbols from the microprogram source files for
easy specification and interpretation.

Architecture 

The analyzer can be configured for 30, 60, or 90 channels
of data acquisition. Each configuration must have a
control card and from one to three data acquisition cards,
each containing 30 data acquisition channels. The following
table lists the analyzer's configurations.



Number of Input Channels   Control Cards   30-Channel Cards

          30                     1                1
          60                     1                2
          90                     1                3



Format Specification 

The Format Specification establishes the conditions and
relationships of target system signals transmitted to the
analyzer through the clock and data input channels.
User-defined labels up to fifteen characters long can be
assigned to signal groups from one to 32 contiguous
channels wide. Saving the Format Specification to the
disc eliminates respecifying data channel labels, threshold
levels, and clock characteristics each time the analyzer
is used. After a label is assigned to a group of input
channels, it also appears on the analyzer softkeys.

To avoid confusion caused when both positive and
negative true data are present in the system under test, 
the 25 MHz analyzer can automatically complement any 
group of data channels. You do not need to invert these 
signals on the target system or complement data as 
measurements are specified and results are interpreted. 

The analyzer has two separate clock inputs. Data can be 
captured on the positive and negative edges of both 
clocks. With two clocks, you can analyze systems with 
multiple CPUs by capturing data on each processor's 
address strobe signal. 

Data and clock signal switching threshold voltages can 
also be varied. Appropriate thresholds for TTL and ECL 
logic families have been preprogrammed. You can also 
select other values between -1 and +1 volts, in 100 mV
increments, for monitoring several different logic families.
Independent threshold specifications can be made for 
each acquisition board (30 data channels). 

Map Specifications 

The Map Specification greatly simplifies measurement 
setups and trace data interpretation by replacing raw 
captured data with user-defined symbols. A "symbol
map" can be associated with any labeled input channel
via the Format Specification. Entries in a symbol map 
appear as part of the analyzer's softkey syntax and in the 
displays of measurement results. Map symbols are de- 
fined as constants, patterns, or ranges. A map symbol 
can be defined in terms of source file line numbers or 
user-symbols from microprogram source files. 

Trace Specification 

The Trigger function determines when the analyzer will 
capture data. Complex triggering conditions can be 
implemented using sequence terms. A "term" is defined
as "AND'ed" constants and patterns. A constant can be
an integer, map symbol, or symbol from the microprogram
source file. A pattern is an integer with embedded
"don't cares" (e.g., 0100xxxxB). Four sequence terms
(trigger being the fourth) are available. Each sequence 
term can be set up to occur from 1 to 65,536 times before 
it is satisfied. A restart term is also available for resetting 
the sequencer. 
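
A pattern with embedded "don't cares" can be reduced to a mask and a value, and a sequence term with an occurrence count can be modeled as a down-counter. The Python sketch below illustrates both ideas; the function and class names are invented for this example, only the binary "B" suffix form is handled, and nothing here reproduces the analyzer's actual softkey syntax or internal logic.

```python
def compile_pattern(text):
    """Turn a pattern such as '0100xxxxB' into a (mask, value) pair.

    Only the binary form with a trailing 'B' is handled in this sketch.
    """
    if not text.upper().endswith("B"):
        raise ValueError("expected a binary pattern ending in 'B'")
    mask = value = 0
    for ch in text[:-1]:
        mask <<= 1
        value <<= 1
        if ch in "01":
            mask |= 1            # this bit must match
            value |= int(ch)
        elif ch not in "xX":     # 'x' is a don't-care bit
            raise ValueError(f"unexpected character {ch!r}")
    return mask, value


def matches(pattern, sample):
    """True if the sample agrees with the pattern on every cared-for bit."""
    mask, value = pattern
    return (sample & mask) == value


class SequenceTerm:
    """A term that is satisfied after it has occurred `count` times."""

    def __init__(self, pattern, count=1):
        if not 1 <= count <= 65536:
            raise ValueError("count must be between 1 and 65,536")
        self.pattern = pattern
        self.remaining = count

    def observe(self, sample):
        if self.remaining and matches(self.pattern, sample):
            self.remaining -= 1
        return self.remaining == 0


term = SequenceTerm(compile_pattern("0100xxxxB"), count=2)
for sample in (0x41, 0x80, 0x4F):
    print(hex(sample), term.observe(sample))
```

A full sequencer would chain up to four such terms and reset them when the restart term matches; that step is left out to keep the sketch short.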

The Trigger Enable function specifies when the analyzer
monitors data for a trigger event. The trigger event can be
stored anywhere within the trace memory buffer, allowing
trace data to be stored preceding, surrounding,
or following the trigger event. The Store function
determines what data should be stored. You can specify
up to four OR'ed terms, with each term consisting of
AND'ed constants and patterns. When the restart term is
used for sequencing, the maximum number of OR'ed
terms is three. The optional store with "sequence protect"
specifies that the sequence events be saved before any
pre-trigger events are stored.

Measurement Results 

The HP 64320S 25 MHz Logic State/Software Analyzer
provides a high degree of display flexibility. When using
source display, the microcode is visible without having to
probe the microword: microword fields, MACRO invocations,
and comments from source files are displayed. The
display shows these source level statements combined
with target data probed by the analyzer. This combination
of program and data makes microcode debug more
productive and efficient. Displays can also include
user-defined symbols specified in the symbol maps and can
automatically reference microassembler symbol tables
generated during software development. These symbols
can be displayed in the trace listings.

Flexible Probing Capability 

The HP 64320S analyzer's clock cable and two of its data 
probes plug directly into the HP 64276 Microprogram 
Development Subsystem to eliminate double probing of 
the Am29300-based target system. Run Control, WCS, 
and the other state analysis data probes connect to the 
target system by general-purpose wire grabbers or D- 
type coaxial cables. The coaxial cables offer better high- 
frequency signal quality and a more reliable connection 
to the target system. 

Measurement Involving Multiple Analyzers 

Measurements with the HP 64320S and other HP 64000
analysis subsystems relate microcode execution to other
software and hardware events. These interactive measurements
are conducted via the high-speed intermodule
bus (IMB). The IMB carries the following five signals
between the analysis subsystems:



IMB Signal        Received by HP 64320S   Driven by HP 64320S

Master Enable     yes                     yes
Trigger Enable    yes                     yes
Trigger           yes                     yes
Storage Enable    yes                     no
Delay Clock       no                      yes

The Master Enable signal coordinates measurement
starts with other analyzers and emulators. When the
analyzer is set up to receive this signal and Master
Enable is "false," the analyzer is completely disabled and
will not capture data. When Master Enable becomes
"true," the analyzer begins examining data.

The Trigger Enable signal operates in the same way as Master
Enable, informing the receiving analysis module when
it can begin looking for its trigger condition.

The Trigger signal, when received, causes the analyzer
to trigger immediately and complete its measurement.
This is valuable, for example, when using the HP 64610S
high-speed Timing/State Analyzer in conjunction with the
25 MHz Logic State/Software Analyzer to determine whether a
spurious signal pulse is related to a microcode event. By
triggering the 25 MHz analyzer on a hardware event, the
microcode execution surrounding the pulse is quickly
pinpointed and evaluated.

The Storage Enable signal exercises hierarchical control 
over the store specification. 

Microassembler 

The HP 64276 Microprogram Development Subsystem 
includes a user-definable microassembler and linker 
capable of generating microwords up to 128 bits in width 
which support Am29300 family devices. The linker al- 
lows assembly of separate modules, reducing turn- 
around time for source microcode changes. 



The definition language operates on a 32 bit, 40 register 
pseudo machine with standard instructions for the move- 
ment and manipulation of data. In addition, higher level 
commands for standard tasks are also provided (i.e., 
commands such as GET_TOKEN, FIND_DELIMITER, 
and GET_OPCODE support lexical analysis). The user- 
definable microassembler can also generate relocatable 
code, with the use of the GEN_CODE command. The 
ERROR and WARNING commands print messages 
from a fixed table to the listing file to simplify error 
detection and correction. Field names and their values 
are easily specified (e.g., SEQ = CONT). 

The definition language is powerful enough to allow the
creation of a customized microassembler capable of:

• Generating code 

• Specifying default values for missing fields 

• Issuing errors for missing fields not having a 
default value 

• Issuing errors for overlapping field definitions 

• Issuing errors and warnings for architectural 
inconsistencies, such as a microinstruction that 
could cause bus contention 

The resulting customized microassembler recognizes 
the syntax specified in the definition stage. Standard 
capabilities are predefined for the microassembler and 
need not be explicitly specified in the definition stage. 
For example, standard pseudo-ops are provided for 
storage allocation, location counter control, and listing 
format control. In addition, a powerful MACRO facility 
is supported. 
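
The field-handling checks listed above (default values, errors for missing fields, errors for overlapping field definitions) can be illustrated with a short model. The Python sketch below is not the HP definition language; the Field class, the assemble function, and the SEQ and ALU field layout are hypothetical and exist only to show the checks in miniature.

```python
class Field:
    """One microword field occupying bits msb..lsb (inclusive)."""

    def __init__(self, name, msb, lsb, default=None):
        self.name, self.msb, self.lsb, self.default = name, msb, lsb, default

    @property
    def bits(self):
        return set(range(self.lsb, self.msb + 1))


def check_overlaps(fields):
    """Raise if any two field definitions share a microword bit."""
    used = {}
    for field in fields:
        for bit in field.bits:
            if bit in used:
                raise ValueError(
                    f"field {field.name} overlaps {used[bit]} at bit {bit}")
            used[bit] = field.name


def assemble(fields, assignments):
    """Pack field values into one microword, applying defaults."""
    check_overlaps(fields)
    word = 0
    for field in fields:
        value = assignments.get(field.name, field.default)
        if value is None:
            raise ValueError(f"missing field {field.name} has no default")
        width = field.msb - field.lsb + 1
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit field {field.name}")
        word |= value << field.lsb
    return word


# Hypothetical 8-bit microword: SEQ defaults to 0xE, ALU has no default.
FIELDS = [Field("SEQ", 7, 4, default=0xE), Field("ALU", 3, 0)]
print(hex(assemble(FIELDS, {"ALU": 0x3})))
```

Architectural checks, such as detecting a microinstruction that could cause bus contention, would be additional rules applied to the assigned values before packing.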

5.5 SIMULATION MODELS

Logic Automation, Inc. 

Simulation Models for Hardware and 

Software Verification 

The freedom and flexibility that have always been the 
benefits of designing with microprogrammed devices are 
now supported by a new generation of computer-aided 
design tools. 

Advanced Micro Devices, Inc. and Logic Automation 
Incorporated have entered into a Library Development 
Relationship. This agreement has made it possible to 
model many of the latest AMD devices and make them 
available to designers. Table 5-2 lists the models for
the Am29300 family.

Many other Advanced Micro Devices models are also
available from Logic Automation; the entire AMD model
list appears at the end of this section. These simulation
models have been developed by Logic Automation with
the cooperation of Advanced Micro Devices. Each model
is based on information provided by AMD and verified
with the same vectors that are used to test the actual part.
Each model is a SmartModel, capable of performing
usage and timing checks that will significantly improve
your ability to debug, verify, and optimize your designs.

SmartModel Simulation Benefits 

Simulation models from Logic Automation are called
SmartModels because they are behavioral language
models with built-in intelligence. This concept (that
information about VLSI devices is most effective when it is
available inside the models used to simulate complex
systems) was introduced and pioneered by Logic
Automation. SmartModels allow you to use a workstation and
logic simulator to verify your designs at the systems level.

Design cycles are shorter because the simulations catch 
many errors— both subtle and obvious— before the first 
prototype is built. Cycles are shortened because Smart- 
Model simulations are fast. They are easy to use and they 
are designed to maximize the effects of your simulation 
runs. Simulation runs are also critical as the first step in
developing test vectors that must be used later to verify 
production systems. 



Table 5-2

Description                 TTL       CMOS       ECL

32-Bit Integer Multiplier             Am29C323
Floating Point Processor    Am29325   Am29C325
16-Bit Sequencer            Am29331   Am29C331
32-Bit ALU                  Am29332   Am29C332
Register File               Am29334   Am29C334   Am29434
Bounds Checker              Am29337
Byte Queue                  Am29338

[Figure 5-8 process blocks: Architectural; High Level Programming;
Microcode Development; Hardware Design, System Capture; Hardware &
Microcode Integration; System Verification; Manufacturing;
Board/System Test Program Development]

Figure 5-8. Microprogrammed Product Development Cycle (without simulating)

Figure 5-9. Microprogrammed Product Development Cycle (with simulating)



SmartModel Simulations Postpone Prototyping 

Without simulating, the microprogrammed product development
requires hardware prototype development
very early in the process. As shown by the shading in the
diagram's process blocks, Figure 5-8, only the overall
design and hardware design (plus schematic capture)
can be completed without breadboarding. Contrast
this situation with the same process diagrammed in
Figure 5-9.

Simulating permits far more of the product development 
cycle to take place before the first hardware prototypes 
are necessary. First of all, the simulation takes the place 
of the breadboarded hardware that would have been 
necessary for integration. In addition, short sections of
code generated in a high level language using existing 
software development tools can also be executed in the 
simulation environment to help in the initial phase of 
system verification. 

SmartModel Simulations Are Fast 

Simulations with behavioral language models run fast. 
The demonstration circuit used below is a simple graph- 
ics processor designed using AMD's new 32-bit building 
block Am29300 family: the Am29331 sequencer, 
Am29332 ALU, Am29325 floating point processor, and 
two Am29334 dual-port register files. There are a total of
39 ICs in the schematic including 4 Am29827 10-bit
buffers, 12 Am29841 10-bit latches, and 8 Am27S35 
registered PROMs. In addition the design contains an 
abstracted behavioral language model of a display 
memory that is equivalent to eight SRAMs. 

Figure 5-10 is a screen print of a simulation running
under Mentor Graphics QuickSim 5.1. A timing diagram
in a trace window occupies the width of the screen at the
top. The QuickSim menu window is below left; next is
a list window showing a few of the circuit lines against
simulation time. In the lower lefthand corner, there is a
transcript window containing messages written by one of
the SmartModels in the circuit. The lower righthand
corner of the screen shows the schematic.

The circuit executes microcode out of ROM to plot the 
pixels that make up a line on a display. The pseudo-code 
for the line-plotting algorithm is below.

x, y, deltax, deltay <- FIFO (1,2,3,4)
e <- 2 * deltay - deltax
for i = 1 to deltax do begin
    plot (x,y)    {XOR in pixel (x,y) into bitmap}
    if e > 0 then begin
        y <- y + 1
        e <- e + (2 * deltay - 2 * deltax)
    end
    else
        e <- e + 2 * deltay
    x <- x + 1
end for

Run on an Apollo DNOOO with Mentor Graphics QuickSim
Version 5.1, the circuit ran through that algorithm executing
the equivalent microcode at a rate of 34 microcode
instructions per minute at a 1 ns resolution. Note that this
was an exercise of the entire design, a true system-level
benchmark.

SmartModels Are Easy To Use 

SmartModel simulations are effective because these
models are designed to make the most of every simulation
run. For example, some users of simulation techniques
have noted that analyzing computer printouts of
logic values is tedious and very time-consuming. Using
SmartModels eliminates that problem. During the initial
stages the models' functional checks pinpoint usage
errors. Later in the design process, the timing checks are
usually more pertinent. In both cases, the models use
messages on the workstation screen to pinpoint the
exact problem by time and schematic instance. This
unique feature of simulation models from Logic Automation
is called Symbolic Hardware Debugging.

Figure 5-10

Symbolic Hardware Debugging is a series of checks
which write error or warning messages in the transcript
window during your simulation runs. There are two types:
functional checks and timing checks. The functional
checks vary greatly with the device type, but essentially
they help make sure a chip is being used correctly. For
example, a DMA controller will include a check on
whether or not all internal modes and registers were
initialized. A DRAM check will produce a message like:
"WE was low at the RAS falling edge."

The timing checks can include set-up, hold, frequency,
pulse width, recovery time, etc., as applicable to the
component and as specified by the semiconductor
vendor's current data sheet. A 1 megabit x 1 DRAM
model, for example, contains about 50 different timing
checks.

Both kinds of checks produce Symbolic Hardware De- 
bugging messages that are very specific. A setup time 
violation, for example, will cause an error message that 
documents: pin name; device, by instance, reference 
designator, and component name; sheet name; design 
name; simulation time; signals and edges, as appropri- 
ate; and setup times, both as they occurred and as 
required by the vendor's data sheet. 
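
To make the idea of a timing check concrete, the following Python sketch shows how a setup-time check might compare the observed setup time with the required one and build a message. It is a simplified illustration and not SmartModel code: the function name, the message wording, and the pin, instance, and timing values are all hypothetical.

```python
def check_setup(pin, instance, data_change_ns, clock_edge_ns, required_ns):
    """Return a violation message, or None if the setup time is met.

    The setup time is measured from the last data change to the clock edge.
    """
    actual_ns = clock_edge_ns - data_change_ns
    if actual_ns >= required_ns:
        return None
    return (f"ERROR: setup violation on pin {pin} of {instance} "
            f"at {clock_edge_ns} ns: actual {actual_ns} ns, "
            f"required {required_ns} ns")


# Hypothetical values: data changed 3 ns before an edge needing 5 ns setup.
message = check_setup("D0", "U12", data_change_ns=97,
                      clock_edge_ns=100, required_ns=5)
if message:
    print(message)
```

A real model would also report the sheet name, design name, and the signals and edges involved, as listed above, and would handle hold, pulse width, and frequency checks in the same style.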

Symbolic Hardware Debugging means your simulation
runs give you answers, not just binary data which you
have to painstakingly decode and compare to the IC data
books.

Figure 5-11. Symbolic Hardware Debugging in the AMD
32-Bit Building Block Family SmartModels

Messages like that during your simulation runs speed
your design debugging and verification. In this case, a
check for an illegal operation has been built into the
model; the operation can occur if the first instruction in an
interrupt service routine is a stack operation. Besides a
service routine that starts with a stack operation, this
error message might be caused by an incorrect interrupt
vector that caused a jump to any location that contained
a stack operation. Similarly, the Am29334 SmartModel
will signal if the write address changes during a write
cycle; the model will issue a warning and write the data
to all the locations involved so that the simulation run can
continue. Many other function checks are built into these
models. For the Am29300 family SmartModels, there are
setup and hold timing checks for each input pin except
the clock. For the clock, there are pulse width and
frequency checks built into the models. Pulse width
checks for the Write Enable and Data Latch Enable pins
are also written into the Am29334 model.

SmartModels Make Your Simulations 
More Efficient 

SmartModels maximize your simulations because they
are adept at handling X's (unknowns). Depending on
where it occurs in the circuit, one unknown can spread
X's throughout your simulation. When that happens, your
run is less useful than it could be because later events are
buried in X's. To gain more information, you fix the first
problem and rerun the simulation. SmartModels are
designed not to generate or propagate X's unnecessarily;
with Symbolic Hardware Debugging, the use of X's
can be very judicious. Our engineers anticipate when an
"X" is truly a "don't care" and keep your simulations useful
as long as possible while always issuing a warning
message to document the event.

SmartModels Are Accurate 

The Logic Automation and Advanced Micro Devices 
Library Development Relationship means that AMD 
supplies our model builders with advance information 
and with the test vectors used for the actual chips. We 
use the test vectors to certify that the SmartModels are 
accurate simulations of the AMD components. 

SmartModels Represent Good Values 

Multiple Timing Versions 

Every SmartModel includes the correct timing for all 
available speed versions. An example is the Am29C323; 
the SmartModel for that part contains the Am29C323, 
Am29C323-1, and Am29C323-2 timing versions. 

Maintenance 

A maintenance agreement will keep your models 
current automatically. When CAE companies update 
their simulators and workstation operating systems, your 
models will be updated. Because Logic Automation 
works with the CAE companies prior to the new software
release, you will generally have new SmartModels in
your hands before you're ready to upgrade your system.
If you have a maintenance agreement, Logic Automation
will also automatically update your SmartModels when 
the manufacturer changes specifications or adds new 
timing versions. 

Documentation and Support 

SmartModels are very easy to install and use. Full
documentation is provided with each set ordered.
Included are: installation instructions; SmartModel Library
Users Guide; data sheets on each model; and relevant
application notes. In addition, our Applications Engineers
are ready to help you with any questions at 503-690-6900.

SmartModels Are Available For Designs Now 

Logic Automation has more than 250 timing versions of
about 100 Advanced Micro Devices components that run
on popular CAE workstations available now.

EPROMs 

Am27128A 16Kx8
Am27LS191, included with Am27S191
Am27PS191/A, included with Am27S191
Am27S191/A/SA 2Kx8

PROMs

Am27S19/A 32x8
Am27S25 512x8
Am27S291A 2Kx8
Am27S35/A 1Kx8
Am27S37/A 1Kx8
Am27S45/A 2Kx8
Am27S47/A 2Kx8

Static RAMs 

Am2130 1Kx8, dual port 

Am2168 4Kx4 

Am2169, included with Am2168 

Am27519 64Kx1 

Am9114 1Kx4 

Am9124, included with Am9114

Am9128 2Kx8 

Am9150 1Kx4 

Am9151 1Kx4 

Am91L14, included with Am9114

Am91L24, included with Am9114

Am93L422 256x4 

Support 

Am29114 real-time interrupt controller
Am2914 interrupt controller 
Am2952 8-bit bidirectional I/O port 
Am2953/A 8-bit bidirectional I/O port 
Am2965 octal driver 
Am2966 octal driver 
Am8237A DMA controller 
Am9513A system controller 
Am9517A DMA controller
AmZ8073 system controller 
AmZ8530 serial controller 

32-Bit Building Blocks 

Am29C323 32-bit multiplier 
Am29325 floating point processor 
Am29C325 floating point processor 
Am29C331 16-bit sequencer
Am29331 16-bit sequencer
Am29332 32-bit ALU
Am29C332 32-bit ALU 
Am29334 register file 
Am29434 register file 
Am29337 bounds register 
Am29338 byte queue 



Bit-Slice Family 

Am2901B/C 4-bit slice 

Am2902A carry look-ahead

Am2903A 4-bit slice 

Am2909 microprogram sequencer 

Am2910/A microprogram controller 

Am29116/A 16-bit microcontroller

Am2911A microprogram sequencer 

Am2940 DMA address generator 

Am2942 timer/counter/DMA address generator 

Am29520 pipeline register 

Am29521 pipeline register 

Am2960 error detection and correction 

Am29C10 microprogram controller 

Am29L116, included with Am29116/A

Multipliers & ALUs

Am25S557 8-bit multiplier 

Am25S558 8-bit multiplier 

Am29C323, see 32-bit building blocks category 

Am29332, see 32-bit building blocks category

Am29516 16-bit multiplier 

Am29517 16-bit multiplier 

Am29L516 16-bit multiplier 

Am29L517 16-bit multiplier 

Programmable Logic Devices 

AmPAL18P8 PAL 

AmPAL22V10/A PAL

Am29PL141 fuse programmable controller 

Am29800 Family 

Am29806 6-bit chip select decoder
Am29809 9-bit equal-to comparator 
Am29818 shadow register/WCS pipeline register 
Am29821/A/Am29C821 10-bit register 
Am29822/A 10-bit register (inverting) 
Am29823/A/Am29C823 9-bit register
Am29824/A 9-bit register (inverting) 
Am29825/A 8-bit register 
Am29826/A 8-bit register (inverting) 
Am29827/A/Am29C827 10-bit bus buffer 
Am29828/A/Am29C828 10-bit bus buffer (inverting) 
Am29833/A/Am29C833 parity bus transceiver 
Am29834/A/Am29C834 parity bus transceiver
(invert register)

Am29841/A/Am29C841 10-bit bus interface latch 
Am29842/A 10-bit latch (inverting) 






Am29800 Family (continued)

Am29843/A/Am29C843 9-bit latch
Am29844/A 9-bit latch (inverting)
Am29845/A 8-bit latch
Am29846/A 8-bit latch (inverting)
Am29853/A/Am29C853 parity bus transceiver
(noninverting latch)
Am29854/A/Am29C854 parity bus transceiver
(inverting latch)
Am29861/A/Am29C861 10-bit transceiver
Am29862/A 10-bit transceiver (inverting)
Am29863/A/Am29C863 9-bit transceiver
Am29864/A 9-bit transceiver (inverting)

Models are added every week, so call to get the latest
[illegible] or price and delivery information:

Logic Automation Incorporated
P.O. Box 31
Beaverton, OR 97075
Tel: (503)690-6900. Fax: (503)690-6906.

East Coast sales office:
[illegible] Office Building, Suite 400
[illegible] Patuxent Parkway
Columbia, MD 21044-3502
Tel: (301)740-8704.

5.6 C COMPILER SUPPORT

Introduction 

With the advent of the Am29300 Family, it has become
relatively easy to design bit slice systems controlled by
very large amounts of microcode.

When a substantial amount of application
microcode must be written, when speed of application
development is important, or when some measure of
portability is desired, a microcode compiler can be
an invaluable, if not essential, tool.

In this section, we discuss compiler implementations 
from two different angles. To begin with, we will discuss 
some of the decisions to be made when implementing a
compiler for a specific architecture. Then we will discuss 
what hardware features are desirable to support the im- 
plementation of a compiler. 

Before going any further, we should note that we do not 
believe that a microcode compiler can by itself provide a 
complete solution to the problem of writing code for bit
slice systems. If you want to implement a general pur- 
pose language, you must design a general purpose 
processor. If you have not designed a general purpose 
processor, then it may be pointless to try to implement a 
compiler for your hardware. Even if your hardware is an 
ideal target for a compiler, there will inevitably be a need 
to code some small portion, at least, in assembler. In 
short, a microcode compiler is a tool, but not a panacea. 

The Microcode C Compiler 

The language we use is called Microcode C. It is similar 
enough to the C language that a programmer who 
already knows C can start programming in Microcode C 
after as little as one day's study. 

The Microcode C compiler must be customized, which 
basically means that we have to write a code genera- 
tor for your hardware, after making certain design deci- 
sions based on your needs and the capabilities of your 
hardware. 

The compiler generates micro-assembler code as its 
output. If you already have a microcode assembler, then 
we can arrange to generate the mnemonics used by your 
assembler. Otherwise, we can generate code for Bit Slice
Software's standard microcode assembler. 

To date, we have developed about 12 different Microcode
C compilers. These have variously been installed
under PC-DOS, VMS, and/or Unix. 



Types 

All Microcode C compilers support a common data type:
the signed integer whose width corresponds to the width
of the processor. Typically, the width is 16 or 32 bits.
Usually the types short and long are treated the same as
int. Structures, unions, and arrays are supported, but
sometimes with restrictions.

Other types are supported if desired and if the hardware 
permits. The type char can be reasonably supported if 
the basic memory architecture allows byte addressing. 
Since most microarchitectures use word oriented addressing,
char is most often simply treated as int. The
type unsigned can be supported if condition codes for 
unsigned comparisons are efficiently implemented. The 
types float and double are usually implemented only if 
there is floating point hardware to support them. How- 
ever, they can also be implemented if software floating 
point routines are written. 

Storage class 

All Microcode C implementations support the storage 
class static. The auto storage class is only supported if 
the hardware allows a reasonable implementation of a 
run time stack. If it is not possible to support a stack, then 
local variables (which are normally allocated on a stack) 
are treated as static and recursive calls are not allowed. 
The extern storage class is supported if the assembler
for which the compiler is generating code supports exter- 
nal references and definitions. 

Most micro-programmers lay great stress on maximizing 
their use of the machine registers. Microcode C supports
their desires by allowing them to declare variables with 
register storage class. Microcode C allows registers to 
be declared globally, as well as locally. Local register 
variables must be saved when a function call is made. 
Global registers never need to be saved or restored.
They can be used to pass data between procedures in 
registers. 

Initialization 

The standard C syntax for static initialization of variables
is supported.

Expressions 

Each implementation supports all the standard C opera- 
tions defined for its supported types. Binary operations 
supported include integer addition, integer subtraction, 
logical left and right shifts, bitwise and, bitwise or, bitwise 
exclusive or, logical and, and logical or. Unary operations
include take address, indirect through address, one's 
complement, logical negation, integer negation, and pre- 
and post- increment and decrement. Integer multiplica- 
tion, division, and remainder are supported when the 
micro-architecture encourages them. 

Statements 

All of the standard C statement types are supported, 
including for, while, do, goto, switch, if, else, break,
continue, case, and default. The switch statement will 
generate a jump table if the micro-architecture permits. 
The compiler also supports a switchf statement, which 
is like a switch except that it does not do a bounds check 
on the switch value before passing it through the jump 
table. Use of switchf instead of switch can save four or 
five micro-instructions if the switch value is known to be 
or forced to be in the range of the switch. For systems
whose sequencers (such as the Am29331) have a hardware
loop counter, the compiler supports a loop statement,
which is very useful for coding fast inner loops. For
Am29331-based systems, the compiler allows loop
statements to be nested.
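
The difference between switch and switchf can be pictured in ordinary C with an explicit jump table. The sketch below is only an illustration: the handler names and the table are hypothetical, and since standard C has no switchf, the unchecked form is modeled as a direct table index whose selector is forced into range by masking.

```c
/* Hypothetical handlers standing in for the case bodies. */
static int op_add(int a, int b) { return a + b; }
static int op_sub(int a, int b) { return a - b; }
static int op_and(int a, int b) { return a & b; }
static int op_or(int a, int b)  { return a | b; }

typedef int (*handler_t)(int, int);

/* The jump table a compiler might build for cases 0..3. */
static const handler_t jump_table[] = { op_add, op_sub, op_and, op_or };
#define TABLE_SIZE (sizeof jump_table / sizeof jump_table[0])

/* Models switch: the selector is bounds-checked before the table is
   used. Out-of-range values take the "default" path. */
int dispatch_checked(unsigned sel, int a, int b, int fallback)
{
    if (sel >= TABLE_SIZE)
        return fallback;
    return jump_table[sel](a, b);
}

/* Models switchf: no bounds check is made. The selector is forced
   into the range 0..3 by masking it to two bits. */
int dispatch_unchecked(unsigned sel, int a, int b)
{
    return jump_table[sel & 3u](a, b);
}
```

The comparison and branch in dispatch_checked correspond to the few micro-instructions that switchf saves.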

Built-in functions 

Each micro-architecture has a unique interface to exter- 
nal buses, registers, and signals. Each Microcode C 
implementation supports this interface by providing a set 
of built-in hardware functions designed specifically for 
the particular implementation. These built-in functions 
behave like macros in that they are expanded in-line. A 
basic set of built-in functions might include: 



data = input( source );     - gets data from an external register
output( sink, data );       - sends data to an external register
cc( condition_code );       - tests a hardware condition code
memcycle( type );           - initiates a memory cycle



In this case, "source", "sink", "condition_code", and
"type" would be chosen from a set of constants contained
in a standard file supplied with the compiler. Any special 
timing constraints (such as "you must wait two cycles to 
read back data after cycling the memory") are enforced 
automatically by the compiler. 

One of the advantages of using built-in functions, as 
opposed to adding new keywords to the language, is that 
it is possible to debug microcode programs on the host 
system using the standard C compiler, simply by writing
a small library of functions which are equivalent to the 
built-in ones and which simulate the operation of the 
target hardware. 
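
As an illustration of host-side debugging, the sketch below provides ordinary C stand-ins for the four built-in functions listed earlier. The constant names, the number of registers, and the simulated behavior are all assumptions made for this example; they are not taken from any actual Microcode C distribution.

```c
/* Host-side stand-ins for the built-in hardware functions.
   Every name and size below is hypothetical. */
enum { REG_STATUS, REG_DATA_IN, REG_DATA_OUT, NUM_EXT_REGS };
enum { CC_ZERO, CC_NEGATIVE };
enum { MEM_READ, MEM_WRITE };

static int ext_regs[NUM_EXT_REGS];  /* simulated external registers */
static int last_result;             /* value the condition codes reflect */
static int mem_cycles;              /* number of memory cycles started */

/* input(): gets data from a simulated external register. */
int input(int source)
{
    return ext_regs[source];
}

/* output(): sends data to a simulated external register. */
void output(int sink, int data)
{
    ext_regs[sink] = data;
    last_result = data;
}

/* cc(): tests a simulated hardware condition code. */
int cc(int condition_code)
{
    switch (condition_code) {
    case CC_ZERO:     return last_result == 0;
    case CC_NEGATIVE: return last_result < 0;
    default:          return 0;
    }
}

/* memcycle(): records that a memory cycle was initiated. */
void memcycle(int type)
{
    (void)type;
    mem_cycles++;
}
```

Linked with such a library, the same source can be compiled and stepped through on the host before any target hardware is available.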

Scratchpad RAM

In order to allocate non-register variables, there must be 
some sort of an external scratchpad memory accessible 
to the compiler. When reference is made to a non- 
register variable, the compiler automatically generates 
the micro-operations needed to set up the address and 
write out or read back the data. 

Compaction 

All microcode compilers must do some form of compaction
in order to take advantage of the parallelism usually
inherent in the micro-architecture. Microcode C uses
resource-based compaction on straight line code seg- 
ments. Operations are compacted in the order that they 
are generated by the compiler. An operation can be 
moved to precede a previously compacted operation if 
there is space for it and if no resource dependencies are 
detected while trying to move it. 
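
A minimal sketch of this kind of compaction is given below. It is not the Microcode C implementation: the data structures, the bit-mask encoding of resources, and the rule that any dependency keeps an operation strictly after the word it depends on are simplifying assumptions chosen for illustration.

```c
#define MAX_WORDS 64

/* One operation from the code generator (all members are bit masks). */
typedef struct {
    unsigned fields;   /* microword fields the operation occupies */
    unsigned reads;    /* registers or buses it reads */
    unsigned writes;   /* registers or buses it writes */
} MicroOp;

/* Each microinstruction is kept as the union of the ops packed in it. */
static MicroOp words[MAX_WORDS];
static int num_words;

/* Nonzero if op has a dependency on anything already in word w. */
static int depends_on(const MicroOp *op, const MicroOp *w)
{
    return (op->reads  & w->writes) != 0    /* read after write  */
        || (op->writes & w->reads)  != 0    /* write after read  */
        || (op->writes & w->writes) != 0;   /* write after write */
}

/* Pack op into the earliest legal microinstruction.
   Returns the index of the word used, or -1 if the buffer is full. */
int compact(const MicroOp *op)
{
    int slot = num_words;   /* fallback: append a new word */
    int i;

    /* Try to move the operation backward past earlier words. */
    for (i = num_words - 1; i >= 0; i--) {
        if (depends_on(op, &words[i]))
            break;                          /* cannot move past this word */
        if ((op->fields & words[i].fields) == 0)
            slot = i;                       /* there is space for it here */
    }

    if (slot == num_words) {
        if (num_words == MAX_WORDS)
            return -1;
        words[slot].fields = words[slot].reads = words[slot].writes = 0;
        num_words++;
    }
    words[slot].fields |= op->fields;
    words[slot].reads  |= op->reads;
    words[slot].writes |= op->writes;
    return slot;
}
```

Operations are offered to compact() in the order the compiler generates them, so an independent operation that uses a free field is placed in an earlier word, while a dependent one starts a new word.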

In-line assembler code

If it is necessary to code key sections of a program in 
assembler, the compiler allows the user to include assembler
code in-line. In order for in-line micro-assembler
code to share data with compiled code, there is also a 
mechanism for in-line code to refer to register variables 
by the names they were declared with (rather than by 
number). 

The overall aim is to provide a compiler which is inexpensive
to build, simple and robust in construction, and can
be relied upon to generate correct code. Although the 
compiler does take care of a great many housekeeping 
details (such as register number assignment and 
"constant folding"), it does not attempt to perform com- 
plex global flow analysis and optimization. Instead, the 
burden of doing so is placed on the programmer. Fortunately,
the C language is designed to permit you to
perform in source code the kinds of optimizations that 
optimizing compilers usually do. For instance, it is easy 
to recode array references in inner loops to use pointer
operations instead. 
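
For example, the two functions below compute the same sum. This is a sketch in standard C with hypothetical function names; the second form is the kind of hand recoding meant here, with the running address held in a register variable.

```c
/* Indexed form: each a[i] implies an address calculation
   (base plus subscript) inside the loop. */
int sum_indexed(const int a[], int n)
{
    int i, total = 0;

    for (i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Pointer form: the address is kept in a register variable and
   stepped directly, so no subscript calculation is needed. */
int sum_pointer(const int *a, int n)
{
    register const int *p = a;
    const int *end = a + n;
    int total = 0;

    while (p < end)
        total += *p++;
    return total;
}
```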

There are many advantages to using Microcode C to 
write microcode. Programs are more readable, more 
comprehensible, and more maintainable. The use of a 
high level language dramatically increases productivity
and makes it much, much easier to try out different
approaches during software development. 

Hardware Design Considerations 

If you are in the fortunate position of being in the process
of designing new hardware and you want to know how to 
make it easy for a compiler to produce code for it, here 
are a few ideas. 

ALU 

To begin with, it is always nice if the ALU supports "three 
address code", which means you can add register A to 
register B and place the result in register C in one 
instruction. 

Second best, but also acceptable, is two address code, 
in which you add register A to register B and place the 
result in register B in one instruction. 

In general, it is preferable for compiling purposes if any of 
the following can be accomplished in one instruction: 

add a register to a register 

move the contents of a register to a second register 

add a constant to a register 

Although these would seem to be fairly simple things to 
do, it is surprising how many micro-architectures are
unable to carry them out. You should not get the idea that 
it would not be possible to generate a microcode compiler 
for a given micro-architecture if it cannot perform the 
operations outlined above in one instruction. We recog- 
nize that many other factors, such as cost and board 
space, must be taken into account in your particular 
design and we are well aware of the dangers of over- 
specifying a design. 

For two address architectures, you should try if possible 
to avoid putting any restrictions on the second address, 
such as "the upper two bits of the second address must 
be the same as the upper two bits of the first address". 
Such restrictions can be worked around successfully, but 
they can be a rich source of bugs and are acceptable only 
if you are sure that the saving of a couple of bits in the 
microword will be worth all the trouble it will cause to both 
compiler writer and micro-programmer! 

Constant Field 

Most micro-architectures provide at least one constant 
field in the micro-instruction word. This field is set with 
constant data for the sequencer (jump addresses) or the
ALU. This field should be at least as wide as the maxi- 
mum of the sequencer address width and the data 
address width. In the best of all possible worlds, it should 



also be as wide as the ALU and internal data paths. On 
a machine with a 32 bit ALU, it may be too expensive to 
reserve 32 microword bits for a constant field. One
solution is to reserve only 16 bits and load all constants
in two steps (load an upper data register from the con- 
stant field and then source the constant field combined 
with the upper data register). This solution can be made
somewhat more satisfactory if it is also possible to
treat the 16 bit data field as a 32 bit number in one or more
of the following ways: 

zero extend the 16 bit constant on the left 

zero extend the 16 bit constant on the right 

sign extend the 16 bit constant on the left
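
The three extensions, and the two-step load of a full 32 bit constant, can be written out as follows. This is a sketch in standard C; the function names are ours and do not correspond to any particular hardware.

```c
#include <stdint.h>

/* Zero extend on the left: the constant occupies bits 15..0. */
uint32_t zext_left(uint16_t k)
{
    return (uint32_t)k;
}

/* Zero extend on the right: the constant occupies bits 31..16. */
uint32_t zext_right(uint16_t k)
{
    return (uint32_t)k << 16;
}

/* Sign extend on the left: bit 15 is copied into bits 31..16. */
uint32_t sext_left(uint16_t k)
{
    return (k & 0x8000u) ? (0xFFFF0000u | k) : (uint32_t)k;
}

/* Two-step load: step 1 loads the upper data register from the
   constant field, step 2 combines it with a second constant. */
uint32_t load_two_step(uint16_t upper, uint16_t lower)
{
    uint32_t upper_reg = zext_right(upper);   /* step 1 */
    return upper_reg | zext_left(lower);      /* step 2 */
}
```

Small positive and negative constants, which are by far the most common, then need only one step; only constants that do not fit in 16 bits pay for the second.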

Sequencer 

In order to implement jump tables for switch statements
and to allow computation of addresses for indirect
function calls, it is desirable if an address for the
sequencer chip can be computed in the ALU. Typically this
can be done by providing an external register which can 
be written to from the ALU's Y bus and then read into the 
sequencer using its "direct" inputs. 

Similarly, if the sequencer contains a loop counter (as the
Am29331 does), it would be nice if it could be loaded with
an arbitrary value computed at run time in the ALU. This
could be done using much the same mechanism as 
described above. 

For branching within the microprogram, it is most desir- 
able if there is a field in the micro-instruction which is big 
enough to hold the maximum microcode address. It 
should be possible to branch to an arbitrary microcode 
location in one micro-instruction. The address should be
in one contiguous field of the micro-instruction. Although
these ideas may seem obvious, we have seen several 
systems which ignored them. For instance, one system 
required the branch address to be loaded into a special
register, with the actual jump in a subsequent instruction. 
Another system used a 4 bit "page register" with a 12 bit
sequencer to address a 16 bit microcode address space.
Although it was feasible to develop a compiler for both of
these systems, the hardware design made all branches 
relatively expensive in the first case and all subroutine 
calls relatively expensive in the second case. 
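
The paged scheme of the second system can be pictured as follows. The function names are hypothetical, and the sketch shows only how a 16 bit address splits into the two parts; it does not model the system itself.

```c
/* A 4 bit page register supplies the upper bits of a 16 bit
   microcode address; the 12 bit sequencer supplies the rest. */
unsigned full_address(unsigned page, unsigned offset)
{
    return ((page & 0xFu) << 12) | (offset & 0xFFFu);
}

/* The page register value needed to reach a given address. */
unsigned page_of(unsigned addr)
{
    return (addr >> 12) & 0xFu;
}

/* The part of the address the sequencer itself can hold. */
unsigned offset_of(unsigned addr)
{
    return addr & 0xFFFu;
}
```

Because a target address no longer fits in one contiguous field, reaching it involves the page register as well as the sequencer.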

In order to achieve the maximum possible instruction 
rate, most systems are designed so that a conditional 
branch in one instruction is made based on condition 
codes computed in the immediately previous instruction.
In some systems, all condition codes are latched in a 
register at the end of the first instruction, so that any one 
can be tested in the second. In other systems, the 
condition code to be tested is selected at the end of the 
first instruction and only the one selected bit is latched, in
order to save a couple of chips. A microcode compiler
can be made to cope with either way of doing things, 
although the first is preferable. 

In general, compiled code cannot always benefit from 
this pipelining of ALU and sequencer operations. A nice 
feature, which you might consider including in your 
design, would be to have an extra bit in the instruction 
which, when set, would cause the cycle length to be 
doubled. If the condition code were available halfway 
through the double cycle, then it would be possible to 
code a conditional test and a branch in the same instruc- 
tion. Although this would not save any time, it would save 
on expensive microword space. 

Floating Point 

It is a relatively simple task to generate code for low 
latency parts, such as the Am29325. 

Integer Multiplier 

Multiplications are often generated by compilers during 
subscript calculations, if the size of the object being 
subscripted is not a power of 2. In order of increasing cost
and speed, there are three ways to provide for multipli- 
cation in a bit slice design. The cheapest is to simply use 
the integer ALU to perform the standard shift and add 
algorithm, which costs one machine cycle per result bit 
(e.g. 32 cycles for a 32 by 32 bit multiplication). The next 
option is to provide a multiplier which can multiply ad- 
dress offsets, but not data, in one cycle. For instance, if 
the data paths were 32 bits, but the address width was
only 16 bits, you could provide a 16 by 16 bit multiplier. 
This would take one cycle to compute a 16 bit offset, but
would require four cycles to compute a 32 bit result. The 
fastest option is to use a multiplier, such as the 
Am29C323, which can handle either address or data
calculations in one cycle. 
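
The first two options can be sketched in C as follows. The function names are ours. The first routine performs one loop iteration per bit, mirroring the cost quoted for the shift and add algorithm. The second builds the low 32 bits of the product from 16 by 16 bit partial products; how those steps map onto machine cycles depends on the hardware and is not modeled here.

```c
#include <stdint.h>

/* Shift and add multiply: one iteration per bit of the multiplier.
   Returns the low 32 bits of the product; if cycles is not null,
   the number of iterations is stored through it. */
uint32_t mul_shift_add(uint32_t a, uint32_t b, int *cycles)
{
    uint32_t product = 0;
    int i;

    for (i = 0; i < 32; i++) {
        if (b & 1u)
            product += a;   /* add the multiplicand when the bit is set */
        a <<= 1;
        b >>= 1;
    }
    if (cycles)
        *cycles = i;
    return product;
}

/* The same low 32 bits assembled from 16 by 16 bit multiplications.
   The high-by-high partial product is omitted because it affects
   only the upper half of a full 64 bit result. */
uint32_t mul_16x16_parts(uint32_t a, uint32_t b)
{
    uint32_t al = a & 0xFFFFu, ah = a >> 16;
    uint32_t bl = b & 0xFFFFu, bh = b >> 16;

    return al * bl + ((al * bh + ah * bl) << 16);
}
```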

Scratchpad Memory 

In order to be able to declare non-register variables, there
must be a memory somewhere to hold them. In most 
systems, this takes the form of a small, fast, local memory.
In others, the bit slice processor uses memory on the
main system bus. 

If the memory is on the main system bus (a VME Bus or 
a Multibus, for instance), then it is usually a byte address- 
able memory. If your processor is to perform only word 
accesses on such a memory, then you might consider 
setting up the addressing so that the processor puts out 
a word address to the bus interface, which converts the 
address to a byte address. For instance, suppose the bus
has 24 address lines. If you use byte addresses in the 
processor, then any time some C code needs to do the
subscript calculation 

a[i],

it has to multiply the subscript by the size of the object 
being subscripted. Although this multiplication can be
converted into a shift if the size is 16 or 32 bits, this
still imposes an unnecessary penalty for such a routine
operation. A better scheme (for a processor whose word 
size is 16 bits) would be to use 23 bit addresses in the 
processor and have the bus interface in effect shift the 
address left by one and always supply a least 
significant bit of zero. For a processor which is 32 bits
wide, you would use a 22 bit address in the processor, 
shift the address by two, and force the two least signifi- 
cant bits to zero. 
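
The conversion performed by the bus interface, for the 24 bit bus of this example, amounts to the following. This is a sketch with hypothetical function names; in a real design the shift is done by wiring, not by code.

```c
#include <stdint.h>

/* 16 bit processor: a 23 bit word address becomes a 24 bit byte
   address whose least significant bit is always zero. */
uint32_t bus_addr_16(uint32_t word_addr)
{
    return (word_addr & 0x7FFFFFu) << 1;
}

/* 32 bit processor: a 22 bit word address becomes a 24 bit byte
   address whose two least significant bits are always zero. */
uint32_t bus_addr_32(uint32_t word_addr)
{
    return (word_addr & 0x3FFFFFu) << 2;
}

/* Inside the processor, the address of a[i] for word-sized elements
   is then simply base plus subscript, with no scaling. */
uint32_t element_addr(uint32_t base_word_addr, uint32_t i)
{
    return base_word_addr + i;
}
```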

Multiple Memories 

One of the fundamental features of C is that it assumes 
that all memory accesses are identical and that a pointer 
can point to any addressable memory location. This 
makes it very tricky to support a system with memories 
with overlapping address spaces. For instance, if you 
have a pointer stored somewhere and you want to 
indirect through it, there are two problems. First, you
must identify the memory in which the pointer is stored. 
Second, you must identify the memory to which the 
pointer points. 

In most bit slice designs, the problem of overlapping 
address spaces usually comes up in one of two ways. 

In the first and simplest case, memory address space 
overlap almost always occurs with control store memory 
and scratch pad memory. However, it is easy to tell which 
is which if control store memory contains only code and 
scratch pad memory contains only data (which may 
include pointers to functions in control store memory). 

In the second case, the problem may arise if the hard- 
ware can operate on a host bus, such as a VME bus. 

While it is conceptually possible to support an architec- 
ture featuring multiple memories of different granulari- 
ties, the implementation of the concept would add a great 
deal of complexity to the code generator, because ob- 
jects have different sizes in different memories. For 
instance a structure in one memory would have a differ- 
ent set of offsets to its members than the same structure 
in a memory with different granularity. 

Usually, when Microcode C is implemented on a proces- 
sor, one memory is picked to be the default system 
memory, as far as the microcode is concerned. All 
declared variables are stored in this memory. Space is 
also allocated within the memory for the run-time stack, 
if that is required for the implementation. All addressing
operations generate addresses in this memory. All
indirection operations (including array/structure/union
references) generate addresses within this memory.

Built-in Functions 

Other memories (if any) are treated as peripheral devices
and built-in functions are implemented to support them. 
For instance, a very common configuration might include 
a word-addressed 4K static memory and an interface to 
a byte-addressed VME bus. A Microcode C implemen- 
tation for such a machine would designate the static 
memory as the main memory. The VME bus would be 
supported by a set of built-in functions, such as 

set_vme_address( expr );

result = read_vme_bus(); /* at address */

write_vme_bus_byte( expr ); /* at address */

write_vme_bus_word( expr ); /* at address */

write_vme_bus_long( expr ); /* at address */

The disadvantage of this scheme is that it makes it 
impossible to use C structure references to refer to such 
external data. However, it does make it easier to support
some of the more esoteric interfaces, such as those
which support pre-fetching of data through FIFOs. 

Addressing 

In general, the ALU should be at least as wide as the 
memory address register of the main system memory. If 
it is not, then it is necessary to resort to either segmenting 
the address space or using very expensive double preci- 
sion integer arithmetic for all address calculations. Nei- 
ther of these two alternatives is very attractive! 

In some micro-architectures, the main integer ALU 
handles all the work of generating memory addresses. In
others, there is a separate functional unit, often featuring 
pointer and offset registers. These units are usually very 
effective for the special purposes for which they are 
designed but often lack certain fundamental functionality 
which is very useful to the C compiler. 

The main deficiency, which we have seen in some 
systems, is the lack of the ability to generate an address 
based on taking a constant offset from a pointer register, 
without writing the resultant address back into the pointer 
register. 

Given that MAR stands for "Memory Address Register" 
and that "constant" could be negative, the basic functionality
which is desirable for the compiler would include

MAR = constant 

MAR = arbitrary expression result 



MAR = pointer register + constant 

MAR = pointer register + arbitrary expression result

pointer register = constant 

pointer register = arbitrary expression result 

Note that this by no means excludes additional function- 
ality, such as offset registers or multiple MARs. An actual 
hardware implementation could provide several variations
on this scheme, such as providing operations in
which a small constant is implicit in the operation, rather
than having to be placed into a literal field. This allows 
certain memory addressing operations to be combined 
with operations which use the literal field. 

To efficiently support pre-increment and pre-decrement 
operations we add 

MAR = pointer register = pointer register + constant

To efficiently support post-increment and post-decrement
operations, we add

MAR = pointer register 

pointer register = pointer register + constant 

with the sense that this is done in one operation. 
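
The operations listed above can be modeled in C as shown below. The structure and function names are hypothetical, a single pointer register is assumed, and each function stands for what the hardware would do in one operation.

```c
/* A toy model of the address unit: one Memory Address Register
   and one pointer register. */
typedef struct {
    long mar;   /* Memory Address Register */
    long ptr;   /* pointer register */
} AddrUnit;

/* MAR = constant (or an arbitrary expression result). */
void mar_load(AddrUnit *u, long value)
{
    u->mar = value;
}

/* pointer register = constant (or an arbitrary expression result). */
void ptr_load(AddrUnit *u, long value)
{
    u->ptr = value;
}

/* MAR = pointer register + constant; the pointer is not modified. */
void mar_ptr_plus(AddrUnit *u, long k)
{
    u->mar = u->ptr + k;
}

/* Pre-increment or pre-decrement:
   MAR = pointer register = pointer register + constant. */
void mar_pre_update(AddrUnit *u, long k)
{
    u->ptr += k;
    u->mar = u->ptr;
}

/* Post-increment or post-decrement: MAR takes the old pointer
   value, then the pointer is stepped. */
void mar_post_update(AddrUnit *u, long k)
{
    u->mar = u->ptr;
    u->ptr += k;
}
```

In C terms, mar_ptr_plus serves structure member and stack variable references, mar_pre_update serves *++p, and mar_post_update serves *p++.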

The Stack

Since the stack pointer (SP) is simply a dedicated pointer
register, all the operations on pointer registers described 
above also apply to the SP. 

Most modern microprocessors reserve two registers to 
control the stack: the SP (which points to the top of the 
stack) and the Frame Pointer (FP) which points to the 
base of the current stack frame. The use of the FP allows 
a compiler to use stack offsets which are constant irre- 
spective of how much has been pushed onto the stack 
(for temporaries or called function arguments). 

In the interest of avoiding extra overhead on function 
entry and exit and at the expense of some extra internal 
housekeeping, the Microcode C compiler dispenses with 
the use of an FP and uses the SP only. The disadvantage 
of not keeping a separate FP is that the task of generating 
a stack trace back becomes much more complicated. 
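
The internal housekeeping amounts to tracking, at compile time, how far the SP has moved since function entry. The sketch below is our own illustration, not the compiler's code; it assumes one word per stack slot and a stack that grows toward lower addresses.

```c
/* Compile-time record of stack movement within one function. */
typedef struct {
    int pushed;   /* words pushed since function entry */
} FrameState;

/* Called when the code generator emits a push of n words. */
void note_push(FrameState *f, int n)
{
    f->pushed += n;
}

/* Called when the code generator emits a pop of n words. */
void note_pop(FrameState *f, int n)
{
    f->pushed -= n;
}

/* Offset from the current SP of a variable whose offset from the
   SP at function entry was entry_offset. With a frame pointer this
   adjustment would not be needed. */
int sp_offset(const FrameState *f, int entry_offset)
{
    return entry_offset + f->pushed;
}
```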

Bit Slice Software 

321 Auburn Drive 

Waterloo, Ontario, N2K 2X7

(519)885-4313 

© 1987 by R. Preston Gurd
5.7 WRITABLE CONTROL STORE

5.7.1 Agility 

AG-11B Microprogram Development 

The AG-11B combines with your IBM personal computer
to create a complete development station for micropro- 
gram-based designs. Its high performance and very low 
cost open new design opportunities for using flexible bit 
slice, ASIC, DSP, and 32-bit building block architectures. 
The AG-11B provides high speed in-circuit emulation of
your design target's ROM or PROM. 

Writable Control Store 

The heart of the AG-11B is the Writable Control Store
module (WCS) resident within your IBM PC. Each WCS 
has a memory array 96 bits wide by 4096 words deep 
which can be increased in width and/or depth with additional
modules to suit virtually any size microprogrammed
application. Your microcode is loaded into WCS
memory using your personal computer and AG-11B
software. The WCS utilizes high-speed static RAM 
which provides a 50 ns maximum access time to your 
target. 

Configurable Buffer Interface and Software 

The AG-11B offers maximum flexibility in configuring
for your particular design. The WCS interfaces to your
target through the Target Interface Board. The hardware
is complemented by the AG-11B software, which
allows easy software control of your configuration
variables. The AG-11B software, which is either menu-driven
or command-line driven, provides control of
breakpoint and target control signals and complete WCS 
card diagnostics. 

mcASM Microcode Assembler 

Included optionally with the AG-11B is the mcASM Structured
Microcode Assembler. Developed as a joint effort
between Microtec Research and Advanced Micro De- 
vices, this assembler features macro support, design 
rule checking, nonpositional keyword syntax, and relocatable
segments. mcASM lets you define your target's
architecture and assembly mnemonics, and then pro- 
duces executable microcode for your target in a format 
that is easily loaded into the WCS. 

Applications 

Microprogrammed architectures are increasingly used to 
boost performance in applications such as graphics, 
peripheral controllers, communications, military, robotics,
and industrial automation. The AG-11B supports all
architectures which use microprogramming, including bit 
slice as well as ASIC, DSP, and 32-bit building block 
devices. And since it is not designed for any specific 
architecture, the AG-11B is adaptable to any microprogrammed
product.

Cost and Time Savings 

The AG-11B:

• uses the computing power of an inexpensive 
IBM PC 

• comes at a fraction of the cost of other micro- 
code development stations 

• is a cost-effective way to set up multiple 
development stations so that microcode devel- 
opment work can proceed in parallel 

• lets you avoid the time and expense of burning 
new PROMs after each change to your micro- 
code 

• increases the productivity and morale of
firmware engineers 

• is available immediately and can be set up 
quickly and easily 



For more information, contact Agility, 1290 Lawrence Station 
Road, Sunnyvale, CA 94089, (408) 744-0806. 



Reprinted with permission from Agility 
Bipolar building blocks
deliver supermini speed 
to microcoded systems 



As CMOS processes start to encroach on the 
performance of bipolar circuits, bipolar 
technology is taking the next step to
keep itself in the lead for the highest speed 
systems. A family of five bipolar VLSI com- 
putational circuits— fabricated with a scaled, 
ion-implanted, oxide-isolated process and three
levels of metal interconnections for high den- 
sity—provides a set of functionally partitioned 
microprogrammable VLSI building blocks for 
systems such as superminicomputers, digital 
signal processors, high-speed controllers, and 
many others. The modularity of the system 
functions ensures that the chips can meet the 
performance requirements of a general- 
purpose superminicomputer, as well as those of 
an image processor, which are radically different
from each other.

Included in the family are three parts that 
form the core of a general-purpose micro- 
programmed system: a 32-bit arithmetic and 
logic unit (ALU), a 16-bit microprogram 
sequencer, and a 64-by-18 four-port, dual- 
access RAM. And, for systems that do a large 
number of multiplications or floating-point 



Reprinted with permission from Electronic Design, November 1 5, 1984. Copyright 
1984, Hayden Publishing Co., Inc. 
operations, two performance accelerators (a
32-by-32-bit multiplier and a 32-bit floating-point
processor) will be available to tie onto the
buses (see Design Entry, p. 246). 

The chips offer high performance, a flexible 
architecture, and microprogrammability, and 
even address the problem of fault detection for 
data integrity. These circuits can thus support 
an extremely fast microcycle — about 80 ns 
(projected). That high speed is the result of 
several design considerations: Each part is de- 
signed internally with emitter-coupled logic 
but has TTL-compatible inputs and outputs. 
Second, more power was allocated to the logic 
circuits used in the critical paths than for logic 
in the noncritical paths on each chip, to max- 
imize the speed. Third, by integrating highly 
specialized logic on chip it is possible to execute 
very complex operations in a single cycle. 

The microprogrammability of this chip set 
offers several benefits to the system designer. 
It provides a structured and systematic ap- 
proach for implementing the control mech- 
anism of the system, and like the bit slices, it al- 
lows the instruction set to be customized to suit 
the designer's application (see "Architectural 
Limitations of Bit Slices," opposite). And 
several versions of the initial design can be 
tested, or current designs can be enhanced 
simply by changing the microcode. 

Thus, the functionally partitioned Am29300 
family overcomes all of the performance penal- 
ties of bit-slice structures, while maintaining 
its ability to form a wide variety of architec- 
tures. Even though the chips are designed to 
work together as a family, each can also be used 
independently in an application that requires 
its unique capabilities. 

Pipelines are out

The flexibility of the Am29300 family is 
largely due to a decision not to place pipeline 
stages within the functional blocks. Not includ- 
ing the pipeline registers inside incurs some 
off-chip delays. This is a small price to pay to allow
system designers to optimize the pipeline
structure for their individual needs. Moving the 
register file out of the functional block for the 
ALU also slows things down. At the same time 
it does not force a fixed register size on the user, 
enabling systems to be created with dedicated 



registers, register windows, or register banks — 
all with neither fixed depth nor width. 

Additionally, the high level of integration 
helps eliminate the propagation delays often 
encountered when signals must go from chip to 
chip. The use of VLSI also results in fewer parts 
at the system level, which, in turn, conserves 
power (usually many watts in the case of bi- 
polar systems) and board space. Lastly, a com- 
plete 32-bit solution is provided for applications 
that require increased precision for arithmetic 
operations, high memory bandwidth, and a 



Architectural limitations 
of bit slices 

The limited performance of bit-slice circuits can 
be improved by increasing the width of the slices. 
That higher level of integration results in higher 
performance by reducing the number of off-chip
delays while preserving the flexibility that has 
made bit-slice systems so attractive. However, as 
higher levels of integration become possible, two 
inherent problems with bit-slice architectures 
will limit their ultimate speed. The first involves 
the off-chip delays inherent in cascading. For ex- 
ample, the carry chain is usually the slowest path 
of an ALU. Breaking this chain between slices introduces
off-chip delays into the critical path.

The second problem is that the functional needs 
of many systems do not slice well. Barrel shifters 
and prioritizers are especially difficult to cascade.
Unfortunately, the ability to perform N-bit shifts 
and locate the position of leading 1s is of greatest
importance in applications that require heavy 
number crunching and manipulation of data 
fields, such as image processing, graphics, data- 
base management, and controllers. These are pre- 
cisely the applications whose need for speed forces 
the use of bit-slice devices. The system per- 
formance is compromised not only because these 
operations must be done bit by bit, but also be- 
cause many high speed algorithms cannot be effi- 
ciently implemented. 
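
To make the cost concrete, the sketch below locates the leading 1 of a 32-bit word one bit at a time, which is what must be done when no prioritizer is available. The function name and the step counter are our own additions for illustration.

```c
#include <stdint.h>

/* Bit-by-bit search for the leading 1, starting at bit 31.
   Returns the bit position (31..0), or -1 if x is zero. If steps
   is not null, the number of bits examined is stored through it. */
int leading_one_serial(uint32_t x, int *steps)
{
    int bit;
    int n = 0;

    for (bit = 31; bit >= 0; bit--) {
        n++;
        if (x & ((uint32_t)1 << bit))
            break;
    }
    if (steps)
        *steps = n;
    return bit;   /* -1 when the loop ends without finding a 1 */
}
```

A word whose leading 1 is in bit 16 takes 16 steps to resolve this way, and an all-zero word takes 32.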
DESIGN ENTRY
Microprogrammable 32-bit chips 



large addressing capability (4 billion bytes) to 
support virtual memory systems (Fig. 1). 

The performance of a system depends, not 
just on its raw computing speed, but on its abili- 
ty to respond to events such as interrupts and 
traps. For example, the Am29331 sequencer re- 
sponds to both interrupts and traps at the mi- 
croprogram level very quickly, and its response 
is completely transparent to the interrupted 
microroutine. Also, the Am29332 ALU indirect- 
ly supports the handling of these events by al- 
lowing its internal state to be saved or restored. 

The Am29332, a noncascadable 32-bit-wide
ALU, provides fast number crunching, high
data transfer rates, and powerful bit-manip- 
ulation capabilities. Intended to be used with 
the Am29334 dual-ported RAM, which serves 
as an external register file, the ALU has two 



32-bit input buses (DA and DB) and one 32-bit 
output bus (Y). 

Internally, the device has a 32-bit data path 
that interconnects its various functional 
blocks. These blocks include various shifters 
and multiplexers, a mask generator, a funnel 
shifter, the ALU proper, a priority encoder, a 
parity generator and checker, a master-slave 
comparator, and the status and Q registers 
(Fig. 2). The ALU proper has three 32-bit in- 
puts: R, S and M. The R input comes from the 
funnel shifter, the M input from the mask gen- 
erator, and the S input from a variety of sources 
—the DA or DB buses, status register, or the Q 
register. 

The power and flexibility of the Am29332 
comes partly from its ability to perform oper- 
ations on various data types. It can operate on 



1. A conventional CPU, built with Am29300 building blocks, forms the focal point of an
extremely compact system that cycles as fast as 80 ns.

variable bytes, variable-length bit fields, or sin-
gle bits. This is made possible by the internal
mask generator, which creates a 32-bit mask 
for each instruction (with no time overhead). 
The mask is used as an additional operand in 
each instruction to allow the operation on only 
selected data widths. 

The type of mask generated depends on the 
type of instruction. For instructions that oper- 
ate on variable bytes (1, 2, 3 or 4 bytes) the mask 
is a fence of 1s (bit aligned) for all low-order
selected bytes with a fence of 0s for all high-
order unselected bytes. Instructions that oper-
ate on variable-length bit fields require a mask
that is a string of contiguous 1s for all selected
bit positions and 0s for all unselected bit posi-
tions. In cases where the field exceeds the 32-bit
boundary, the mask does not wrap around, thus 



allowing operation on a contiguous field across 
a word boundary. For instructions that operate 
on a single bit, the mask is a 1 for the selected bit 
position and 0s for the other unselected bits.

For most single-operand instructions, the 
unselected bit positions pass the corresponding 
bits of the operand unmodified. For most two-
operand instructions, the unselected bit posi-
tions pass the corresponding bits of the operand
on the DB input unmodified. Thus, for two-
operand instructions the mask allows the 
merging of two operands in a single cycle. In ad- 
dition to being used internally, the mask can be 
sent out over the Y bus, permitting the gener- 
ator to be used as a pattern generator for test- 
ing purposes. 
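
The three mask types described above can be sketched in a few lines of Python. This is only an illustration: the function names, the convention that positions count up from bit 0 (the least-significant bit), and the `merge` helper are assumptions made for the example, not the Am29332's actual instruction encodings.

```python
WORD = 0xFFFFFFFF  # 32-bit word


def byte_mask(nbytes):
    """Fence of 1s over the low-order `nbytes` bytes (1 to 4), 0s above."""
    if not 1 <= nbytes <= 4:
        raise ValueError("byte width must be 1, 2, 3 or 4")
    return (1 << (8 * nbytes)) - 1


def field_mask(position, width):
    """Contiguous 1s for `width` bits starting at bit `position`.

    Bits that would fall beyond bit 31 are dropped rather than
    wrapped around to bit 0.
    """
    return (((1 << width) - 1) << position) & WORD


def bit_mask(position):
    """A single 1 at the selected bit position."""
    return 1 << position


def merge(result, db, mask):
    """Selected bits come from `result`; unselected bits pass `db` through."""
    return (result & mask) | (db & ~mask & WORD)
```

For example, `merge(0x12345678, 0xAAAAAAAA, byte_mask(1))` keeps only the low byte of the first operand and passes the upper three bytes of the second.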

To speed various mathematical and logical 
operations, many circuits have started to in-
clude a barrel shifter, which has an N-bit input
and an N-bit output. The barrel shifter would
be used to shift or rotate the operand either up
or down from 0 to N bits in a single cycle. Such
high-speed shifting is very useful in operations 
such as the normalization of a mantissa for 
floating-point arithmetic or in applications in 
which the packing and unpacking of data are 
frequent operations. 

However, a more useful circuit is a funnel 
shifter, which can be thought of as having two 
N-bit inputs and one N-bit output. Just such a 
circuit (with 32-bit-wide ports) was included on 
the 29332. The circuit can perform all the oper-
ations of a barrel shifter with capabilities ex-
tended to two operands instead of one. In addi- 
tion, it can extract a 32-bit contiguous field 
across its two operands, a function very useful 
in several graphics applications. And any of its 
operations can be followed by a logical oper- 
ation, with both completed in a single cycle. 
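
A funnel shifter can be modeled as a 32-bit window sliding over the 64-bit concatenation of its two inputs. The Python sketch below is an assumption-laden illustration (the names, the choice of which operand forms the high half, and the meaning of the shift count are all hypothetical); it shows how barrel-shifter operations fall out as special cases.

```python
WORD = 0xFFFFFFFF  # 32-bit word


def funnel_shift(hi, lo, shift):
    """Return the 32-bit window that starts `shift` bits above the LSB of hi:lo."""
    if not 0 <= shift <= 32:
        raise ValueError("shift must be in the range 0 to 32")
    combined = ((hi & WORD) << 32) | (lo & WORD)
    return (combined >> shift) & WORD


def rotate_right(x, n):
    """Rotate: feed the same operand to both funnel inputs."""
    return funnel_shift(x, x, n % 32)


def shift_right_logical(x, n):
    """Logical shift down: feed zeros to the high input."""
    return funnel_shift(0, x, n)
```

Extracting a field that straddles two words is the general case: `funnel_shift(0x12345678, 0x9ABCDEF0, 16)` selects the 32 bits that span the boundary between the two operands.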

Setting the priorities 

Prioritization, useful for controlling N-way
branches, performing normalizations, and
graphics operations such as polygon fills, can
readily be handled by the ALU chip. The built-
in priority encoder sends out a 5-bit binary 
weighted code that signifies the relative posi- 
tion of the most-significant 1 from the most- 
significant bit position of the byte width se- 
lected. That allows prioritization on either 8-, 
16-, 24-, or 32-bit operands. The priority encoder 
output can be passed on to the Y bus or stored in 
the status register. 

If, for example, prioritization is used to nor- 
malize a mantissa during a floating-point 
arithmetic operation, it requires two cycles. In 
the first, the mantissa is prioritized to deter-
mine the number of leading 0s that need to be
stripped off. In the next cycle, the mantissa is 
shifted up by the amount specified by the prior- 
ity encoder output. 
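
The two-cycle normalization can be sketched as follows. This Python model is an illustration only: the function names are hypothetical, and the handling of an all-zero operand (returning `None`) is an assumption, since the text does not say how the device reports that case.

```python
def prioritize(value, nbytes=4):
    """Distance of the most-significant 1 from the MSB of the selected width.

    This equals the number of leading 0s in an operand of 8, 16, 24 or
    32 bits. Returns None for an all-zero operand (assumed behavior).
    """
    width = 8 * nbytes
    value &= (1 << width) - 1
    if value == 0:
        return None
    return width - value.bit_length()


def normalize(mantissa):
    """Normalize a 32-bit mantissa in two steps, mirroring the two cycles."""
    shift = prioritize(mantissa)           # cycle 1: count leading 0s
    if shift is None:
        return 0, 0
    return (mantissa << shift) & 0xFFFFFFFF, shift  # cycle 2: shift up
```

The shift count returned alongside the normalized mantissa is what a floating-point routine would subtract from the exponent.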

Relevant information for each operation per- 
formed by the chip is stored in the 32-bit status 
register after each microcycle. Each byte of the 
status word holds different information. The 
least-significant byte holds the position spec- 
ifier. The next most-significant byte holds the 
width specifier and three other bits that are 
used to test the comparison of unsigned and 



signed operands. The next byte contains the 
Carry, Negative, Overflow, Link, Zero, M and S 
flags. The M flag stores the multiplier bit for 
multiply or the sign compare bit for signed di- 
vision, and the S flag stores the sign of the par- 
tial remainder for unsigned division. The most 
significant byte stores the nibble carries for 
BCD operations. 

The states of the Carry, Negative, Overflow, 
Link and Zero flags are available on the status 
pins, and the status multiplexer allows the user 
to select either the status of the previous in- 
struction (register status) or the status of the 
current instruction (raw status) to appear on 
the status pins. The raw status could be used to 
update an external macro status register. This 
also allows branching at either the micro- or 
macro-level. 

The Q shifter and Q register are primarily 
used to assemble the partial product or partial 
quotient in multiplication and division oper- 
ations. Variable bytes of the status and Q reg- 
ister can either be loaded via the DA and DB 
inputs or can be read over the Y bus. Thus sav- 
ing and restoring of the registers allows effi- 
cient interrupt handling after any microcycle. 
It is also possible to inhibit the update of both 
these registers by asserting the Hold pin. 

Powerful and orthogonal instructions 

The power of the ALU chip's instruction set 
comes directly from the integration of several 
functional blocks mentioned earlier. The com- 
mands are symmetrical as well as orthogonal, 
to make it easier for a compiler to generate effi- 
cient code. Thus, any operation on the DA input 
is also possible on the DB input, and each in- 
struction is completely independent of its data 
type. 

Three-fourths of the instruction set consists 
of variable byte-width (one, two, three or four) 
operand instructions. The byte-width is se- 
lected by two bits in the instruction. For these 
operands, the instruction set supports all con- 
ventional arithmetic, logical and shift oper- 
ations. Arithmetic operations can be per- 
formed on both signed and unsigned binary 
integers. 

Additionally, the instruction set supports 
multiprecision arithmetic such as addition 
with carrying and subtraction with carrying or
borrowing. For all subtract operations it pro-
vides the convenience of using borrowing in-
stead of carrying by asserting the borrow pin.
In this mode the carry flag is updated with the 
true Borrow. To allow efficient execution of 
macroinstructions the chip contains a Macro
mode pin. When this pin is asserted, it al-
lows the external Macro-Carry and Macro-Link
bits instead of their microcounterparts to part-
icipate in the operation.

Instructions that execute algorithms for the
multiplication and division of signed and un-
signed integers in multiple cycles are also pro-
vided. For multiplication, the circuit supports
the modified Booth algorithm, yielding two 
product bits in one cycle. Both single-precision 
and multiprecision division of signed and un- 
signed integers are supported at the rate of one 
quotient bit in every cycle. 
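
To show why the modified Booth algorithm retires two multiplier bits per step, here is a minimal Python sketch of radix-4 Booth recoding for signed operands. It models only the arithmetic, not the chip's Q-register data path, and the function names are hypothetical.

```python
def to_signed(x, bits):
    """Interpret the low `bits` bits of x as a two's-complement integer."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x


def booth_multiply(multiplicand, multiplier, bits=32):
    """Signed multiply using radix-4 (modified) Booth recoding.

    `bits` must be even. Each loop pass examines two multiplier bits
    plus the bit below them and adds -2, -1, 0, +1 or +2 times the
    multiplicand, so a 32-bit multiply takes 16 passes.
    """
    a = to_signed(multiplicand, bits)
    m = multiplier & ((1 << bits) - 1)
    product = 0
    prev = 0  # implicit 0 to the right of the multiplier's LSB
    for step in range(bits // 2):
        pair = (m >> (2 * step)) & 0b11
        b1, b0 = pair >> 1, pair & 1
        digit = b0 + prev - 2 * b1        # recoded digit
        product += (digit * a) << (2 * step)
        prev = b1
    return product
```

Operands are passed as 32-bit two's-complement patterns, so a negative multiplier is written as, for example, `-3 & 0xFFFFFFFF`; the result is an ordinary signed Python integer.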

Besides binary integers the instruction set 
provides basic arithmetic operations for 
binary-coded decimal (BCD) numbers. By oper-
ating directly on the decimal numbers created
in most business applications, significant pro-
cessing time is saved by eliminating the need to
convert from binary to BCD and vice versa. 
Also, the round-off errors involved in con- 
verting from one base to the other are elimi- 
nated. 
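
As an illustration of decimal arithmetic on packed BCD, the Python sketch below adds two eight-digit BCD words one nibble at a time and records the carry out of each digit. The function name and the return format are assumptions made for the example; they are not taken from the Am29332 data sheet.

```python
def bcd_add(a, b, digits=8):
    """Add two packed-BCD words digit by digit.

    Returns (sum, carry_out, nibble_carries), where bit i of
    `nibble_carries` is the carry out of decimal digit i.
    """
    result, carry, nibble_carries = 0, 0, 0
    for i in range(digits):
        da = (a >> (4 * i)) & 0xF
        db = (b >> (4 * i)) & 0xF
        s = da + db + carry
        carry = 1 if s > 9 else 0
        if carry:
            s -= 10                     # decimal correction for this digit
        result |= s << (4 * i)
        nibble_carries |= carry << i
    return result, carry, nibble_carries
```

Adding BCD 199 and BCD 1 this way gives BCD 200 directly, with no conversion to or from binary.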

The last group of instructions was created to 
support variable-length bit fields (1 to 32) and 
single-bit operands. The position and width of 
the field can be specified by either the position 
and width inputs or by fields in the status reg- 
ister, thereby saving bits in the microcode. 
Most of the time, the position and width are 
determined dynamically. It is therefore diffi- 
cult to supply them via the microinstructions. 
For single bit operations only the position spec- 
ifier is needed. 

Bit-manipulation instructions include set- 
ting, resetting, or extracting a single bit of the 
operand or the status register. Logical oper- 
ations on either aligned or nonaligned fields in 
the two operands include OR, AND, NOT and 
XOR. In the case of nonaligned fields it is as- 
sumed that at least one of the fields is aligned to 
bit position 0. It is also possible to extract a field 
from one operand and insert it into another 
operand or extract a field across two operands. 

Enhancing system integrity

The growing need for data integrity has been 
addressed at both the system and the chip level 
by including hardware for fault detection. Dur-
ing calculations, byte-wide even parity is gener-
ated for the data result by the ALU and stored
with the data in the external RAM. Byte-wide 
even parity is also checked at the ALU inputs 
and any error is flagged. 

Even parity is specifically used to check for a 
floating TTL bus. Thus, all interchip connec-
tions are checked out. In addition, hardware for
functional verification is also provided on the
sequencer and the ALU. Functional verification
can be implemented by using two similar de-
vices in the master and slave modes (Fig. 3). In
that setup, both chips perform the same oper- 
ation, with any difference in their outputs being 
flagged as an error. The slave-mode chip's bidi- 
rectional buses operate in their input mode, al- 
lowing the master to compare its own internal 
result with that of the slave on every cycle. Ad-
ditionally, the master checks the output bus to
make sure that no other device is turned on at
the same time. 
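
Byte-wide even parity, and the reason it catches a floating bus, can be illustrated with a short Python sketch. The function names are hypothetical, and the example assumes that floating TTL lines read as logic 1, so that an undriven byte appears as eight data 1s plus a parity 1.

```python
def even_parity_bit(byte):
    """Parity bit that makes the total number of 1s (data + parity) even."""
    return bin(byte & 0xFF).count("1") & 1


def generate_parity(word):
    """Four parity bits for a 32-bit word, least-significant byte first."""
    return [even_parity_bit(word >> (8 * i)) for i in range(4)]


def check_parity(word, parity_bits):
    """Return the indices of bytes whose stored parity bit is wrong."""
    expected = generate_parity(word)
    return [i for i in range(4) if expected[i] != parity_bits[i]]
```

With all lines high, each byte carries nine 1s, an odd count, so `check_parity(0xFFFFFFFF, [1, 1, 1, 1])` flags every byte. Odd parity would accept the same pattern as valid.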

As mentioned earlier, the ALU architecture 
was designed to use an external register file. 
Keeping the file external to the chip permits the 
user to expand it to meet any system need. The 
Am29334, a high-speed 64-word-by-18-bit dual- 
access RAM, provides two independent data in- 
put ports and two independent data output 
ports (Fig. 4). Each port can be read from or 
written to using the separate inputs and out- 
puts. The two accesses are independent except 
for the case when simultaneous write opera-
tions are done to the same word, in which case
the result is undefined. The read address inputs
and the write address inputs of each side are
separate in order to save the cost and time delay
of external multiplexing between a read ad-
dress and a write address.

4. The dual-access RAM serves as an external reg-
ister file for the arithmetic processor chip. The
Am29334 holds 64 words, each 18 bits long. Two
chips are often connected to build a RAM block with
four data outputs, two data inputs, and six address
lines. Each port of the RAM can be independently
accessed to read or write.

The word width of 18 bits allows the RAM to 
store two bytes plus a parity bit for each. Each 
side has separate write enable for the lower and 
upper nine-bit bytes and a common write en- 
able that also switches the address multiplexer. 
The actual write is delayed internally to allow 
the write address to set up internally before 
writing starts. 

It is possible to build a RAM with four data 
outputs, two data inputs and six addresses by 
using two dual-access RAMs and on each side 
connecting the data input, write address and 
write enables of one RAM in parallel with the 
corresponding inputs of the other RAM. This 
expanded RAM may be used in concurrent pro- 
cessing applications in which an ALU and an 
adder (which generates the address) do their 
computations; this yields a result and an ad-
dress in parallel. The two values can then be fed
simultaneously to the multiport memory. 

The sequencer controls the show 

The cycle time of the microprogrammed sys- 
tem is dependent on both the control path (i.e., 
sequencer and microprogram memory) and the 
data path (i.e., register file and ALU). Tradi- 
tionally, the system bottleneck has been the 
control path, especially the critical paths asso-
ciated with conditional branching. Special care
has been taken in the design of the Am29300 
family to balance control and data-path timing. 

A key device contributing to the improved 
control-path timing is the Am29331 16-bit mi- 
croprogram sequencer. It is designed for high 
speed, and that speed has been attained by the 
elimination of functions that would slow down 
the microaddress selection and by including the 
test logic and the test multiplexer in the se- 
quencer (Fig. 5). As in most previous generation 
sequencers, the address register, the incre- 
menter, the address multiplexer, the stack, and 
the counter are standard functions. The se- 
quencer has multiway branch instructions that 
allow 1 of 16 consecutive addresses to be se- 
lected as the branch target in a single cycle. 

The address register in most other sequen- 
cers is called a program counter, but this name 
is not correct if a strict definition is applied. In
the Am29331, the incrementing counter is
placed after the address register, which thus al- 
lows for the handling of traps. The stack stores 
return addresses, loop addresses and loop 
counts. It has 33 levels to permit the deep nest- 
ing of subroutines, loops and interrupts. An 
output, Almost Full (A-Full), indicates when 28 
or more of the levels are in use. 

Available for use in iterative loops, the 
counter can be loaded with an iteration count at 
the beginning of a loop, and the count is tested 
and then decremented at the end of the loop. 



The loop is terminated if the count is equal to 
one; otherwise a jump to the beginning of the 
loop is executed. 
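
The test-then-decrement behavior of the loop counter can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes the counter is decremented on every pass, including the last one; the text does not state the counter's final value.

```python
def run_counted_loop(count, body):
    """Run `body` exactly `count` times using test-then-decrement.

    The counter is loaded at loop entry. At the end of each pass it is
    tested against one and then decremented; the loop exits when the
    tested value was one, otherwise control returns to the loop start.
    """
    if count < 1:
        raise ValueError("iteration count must be at least 1")
    counter = count
    while True:
        body()
        done = counter == 1   # test happens before the decrement
        counter -= 1
        if done:
            break
    return counter
```

Loading the counter with N therefore executes the loop body N times, with a single instruction at the end of the loop doing both the test and the branch.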

There are three buses that carry microad- 
dresses. The bidirectional D bus can be con- 
nected to the pipeline register, providing 
branch addresses or loop counts, or used for 
two-way communication with the data process- 
ing part of the system. The A bus, called an al- 
ternate bus, can be connected to a mapping 
PROM to provide starting microaddresses for 
instructions in a computer. The Y bus sends out
selected microaddresses to the microprogram
memory and accepts interrupt or trap address- 
es if interrupt or trap is employed. 

Four sets of 4-bit multiway inputs provide a 
simultaneous test capability of up to 4 bits. 
One way to use those inputs would be to
decode mode bits in changing positions in mac-
roinstructions. The four select lines select 1
of 16 tests to be used in conditional instructions.
There are twelve test inputs. Four of these may
be used for C (Carry), N (Negative), V (Over-
flow) and Z (Zero), generating internally the
tests C + Z, (NOT C) + Z, N XOR V, and (N XOR V) + Z,
which are used for comparison of signed and
unsigned numbers.
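
The way such combined tests support comparisons can be illustrated by computing the four flags for a 32-bit subtraction A - B and deriving the comparison results from them. The Python sketch below is an assumption-based illustration: the function names are hypothetical, and the carry flag here is a true carry (C = 1 means no borrow), so the unsigned test uses the complement of C; with a borrow-style flag the test would be C OR Z instead.

```python
MASK = 0xFFFFFFFF  # 32-bit word


def subtract_flags(a, b):
    """Compute a - b on 32 bits and return the flags (C, N, V, Z)."""
    a &= MASK
    b &= MASK
    total = a + (~b & MASK) + 1          # two's-complement subtraction
    result = total & MASK
    c = total >> 32                      # 1 means no borrow occurred
    n = result >> 31
    v = ((a ^ b) & (a ^ result)) >> 31   # signed overflow
    z = int(result == 0)
    return c, n, v, z


def unsigned_le(c, n, v, z):
    """A <= B for unsigned operands: (NOT C) OR Z."""
    return bool((c ^ 1) | z)


def signed_lt(c, n, v, z):
    """A < B for signed operands: N XOR V."""
    return bool(n ^ v)


def signed_le(c, n, v, z):
    """A <= B for signed operands: (N XOR V) OR Z."""
    return bool((n ^ v) | z)
```

Comparing 0x80000000 with 1 shows why both kinds of test are needed: as unsigned numbers the first operand is larger, but as signed numbers it is the most negative value and therefore smaller.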

Relative addressing was the only somewhat 
useful function that was removed in order to 
maximize speed. The sequencer supports inter- 
rupts and traps with single-level pipelining, but 
may also be used with two levels of pipelining in 
the control path. It has a 16-bit-wide address 
path and cannot be cascaded, which thus limits 
the addressable memory depth to 64 kwords of 
microcode. That, however, is sufficient for the 
vast majority of applications; a typical
computer, for instance, that has a micropro-
grammed instruction set, might use only about
1 to 2 kwords. However, for systems in which 
the microprogram is the sole program level, its 
size is generally larger. 

Microprogram interrupts supported 

The Am29331 sequencer supports interrupts 
at the microprogram level. Like polling, inter- 
rupts handle asynchronous events. However, 
polling requires explicit tests in the micro- 
program for events, thus leading to long re- 
sponse times, lower throughput, and larger mi- 
croprograms. Interrupts, on the other hand, 
have a response time equal to the cycle time of 
the system (approximately 80 ns), measured 
from the Interrupt Request input (INTR). The 
sequencer accepts interrupts at every micro- 
instruction boundary when the Interrupt En- 
able input (INTEN) is asserted. 

An actual interrupt turns off the Y bus driver 
and asserts the Interrupt Acknowledge output 
(INTA), which should be used to enable an ex- 
ternal interrupt address onto the Y bus, thus 
driving the microprogram memory. The inter- 
rupt also causes the interrupt return address to 



be saved on the stack; this permits nested inter- 
rupts to be handled (Fig. 6). 

The Am29331 is also the first sequencer that 
can handle traps. A trap is an unexpected situa- 
tion caused by the current microinstruction, 
which must be handled before the microin- 
struction completes and changes the state of 
the system. An attempt to read a word from 
memory across a word boundary in a single cy- 
cle is an example of such a situation. When a 
trap occurs, the current microinstruction must 
be aborted and re-executed after the execution 
of a trap routine, which will take corrective 
measures. 

Execution of a trap requires that the se-
quencer ignore the current microinstruction
and push the trap return address (the address
of the ignored microinstruction) on the stack.
The trap address must be transferred onto the
Y bus at the same time. All this can be accom-
plished by disabling the carry-in to the incre-
menter (Cin) and asserting the Force Continue
input (FC) and the Interrupt Request input 
(INTR). 

Also built into the sequencer is an address 
comparator, which allows detection of break-
points in the microprogram. An output signal
from the comparator indicates when the con- 
tent of the comparator register is equal to the 
address on the Y bus. There is an instruction 
that loads the comparator register from the D 
bus and enables the comparator, which may lat- 
er be disabled by another instruction. 

Parallel microprocesses are useful when the 
system must deal with peripheral devices that 
are controlled at the microcode level. Normally 
only one processor is present and it must be 
time multiplexed between the concurrent oper- 
ations that must be performed. When a process 
is suspended its private state must be saved, so 
that it can be restored when the process re- 
sumes execution. That, in turn, requires that 
the state of the sequencer be saved and re- 
stored, or each process must have its own 
sequencer that is active when the associated 
process is active. The first approach is the least 
expensive, but the second offers the advantage 
of shorter response time, because no time is 
spent on saving and restoring the state. 

The Am29331 supports the first approach 
with its bidirectional D bus, through which the
entire state, with the exception of the com-
parator register, can be saved and restored. The
sequencer also supports the multiple sequencer 
arrangement, in which the three-state Y buses 
from the sequencers are tied together driving a 
single microprogram memory. One of the se- 
quencers is active, while the remaining sequen- 
cers are put on hold by asserting their Hold 
inputs. The Hold input disables most outputs 
(the D bus synchronously), disables the incre- 
menter, and enables an internal Force Con- 
tinue. This effectively detaches the sequencer 



from the system and preserves its state. 

The sequencer has a 6-bit instruction input 
that is internally decoded to yield a set of 64 in- 
structions. There are 16 basic branch instruc- 
tions, each in an unconditional version, a condi- 
tional version, and a conditional version with 
complemented test. In addition there are 16 
special instructions like Continue and Push C 
(push counter on stack). The branching instruc-
tions handle jumps, subroutines, various kinds
of loops and exits out of loops. FC actually
overrides the instruction inputs with a Continue
instruction. FC is useful in field sharing and
support for writable microprogram memory.

The Am29331 is one of the few sequencers
where the stack is accessible from outside
through the bidirectional D bus. This indirectly
allows access to the whole state of the se-
quencer except the comparator register. This is
useful when testing the device, and during
system debugging, in which, for example, the
contents of the counter and the stack may be
examined and altered. By including the trou-
bleshooting instructions in the microcode, the
sequencer may aid in debugging itself and the
rest of the system. The access to the state is also
useful for changing context or extending the
stack outside.

This application note describes the design of a high performance microprogrammed
32-bit processor using the Am29300 family of 32-bit building blocks. Basic design
philosophy for a microprogrammed processor is discussed as the design choices
made for this system are explained. Support circuitry used with the Am29300 family
components is also covered in detail. This circuitry includes: Writable Control Store,
Serial Shadow Register diagnostics, and Programmable Array Logic.

SECTION 1

Overview 



This application note describes the design of a high
performance microprogrammed 32-bit processor using 
the Am29300 family of 32-bit building blocks. 

Basic design philosophy for a microprogrammed proces-
sor is discussed as the design choices made for this
system are explained. Issues of microprogram sequence 
control, interrupt handling, microprogram memory op- 
tions, microword layout, macroprogramming, high speed 
multiply, and clock control are covered. 

Support circuitry used with the Am29300 family compo- 
nents is also covered in detail. This circuitry includes: 
Writable Control Store, Serial Shadow Register diagnos- 
tics, and Programmable Array Logic. 

The use of the following Advanced Micro Devices com-
ponents is illustrated in extensively documented ex-
amples:



Am29331      - 16-bit Address Sequencer,
Am29332      - 32-bit Arithmetic Logic Unit,
Am29334      - 64 x 18-bit Four Port Register File,
Am29C323     - 32-bit Parallel (Integer) Multiplier
               Accumulator,
Am29325      - 32-bit Floating Point Unit,
Am29114      - Interrupt Controller,
Am29800      - Family of Interface and Diagnostics
               Logic Devices,
Am29PL141    - Fuse Programmable State Machine,
AmPAL18P8    - Programmable Output 20-pin
               Combinatorial PAL,
AmPAL22V10   - Output Macrocell 24-pin PAL,
Am9151       - Registered RAM with SSR™,
Am99C165     - 16K x 4-bit CMOS high speed RAM.

Figure 1-1. System Components



SSR is a trademark of Advanced Micro Devices, Inc.

SYSTEM LAYOUT

As with all processors, this system contains three main 
portions: Central Processing Unit (CPU), memory, and 
input/output (I/O) (see Figure 1-1). 

The CPU consists of a control section and a data section: 

The data section manipulates data via operations such 
as addition, subtraction, shifting, merging, multiplication, 
and division. These functions are implemented with the 
Am29332 Arithmetic Logic Unit (ALU), Am29325 Float- 
ing Point Processor (FPP), and Am29C323 Parallel 
Multiplier (PM). The data section also stores operands 
and intermediate results in Am29334 register files.

The control section directs the operations performed by 
the data section and determines the order in which the 
operations are performed. This section contains the 
Am29331 Microprogram Sequencer, macro opcode 
register & decode, interrupt control logic, microcode 
control store, control decoding logic, and control multi- 
plexers for the register file and ALU. 

The memory contains a 16K word by 36-bit static RAM.
Included as part of the memory block are two address 
registers/counters, which may be used to speed up 
sequential reads and writes made by the CPU. 

The I/O portion is a simple connection to a host system's 
address and data bus. It is assumed that the Am29300
demonstration system operates as a peripheral proces- 
sor to a larger host system, as might be the case with an 
array or digital signal co-processor. Information to be 
processed by the demonstration system is loaded into 
the memory portion via Direct Memory Access (DMA). 
When processing of the data is complete, the host 
system unloads the memory portion via DMA. 

A diagnostics port is also provided as part of the I/O 
section. This port allows control over the demonstration 
system clock for single stepping, and it allows for serial 
diagnostics to display and control the state of the system. 

Throughout the remainder of this application note, it is 
assumed that the reader has some previous experience 
with microprogrammed processor design and is familiar 
with the Am29300 family data sheets. For those readers 
not familiar with microprogrammed design, some refer- 
ence material is listed in Appendix A. 

DATA FLOW 

The system data paths are illustrated in the block dia- 
gram of Figure 1-2. 



Memory and I/O Sections 

Information processed by the Am29300 system is ex- 
changed between the host system and the memory via 
the external bus interface. The information may be both 
data and macroinstructions. 

From the external bus, the host system is able to address 
the memory via the bus driver connected to the memory 
address bus. Data is moved over the memory data bus. 
The host system's only access to the Am29300 system 
is via these buses to the memory. Therefore, all data to
the system flows through the memory via DMA accesses
by the host system.

Diagnostic control and information flows through the 
external bus interface via the host interface controller. It
controls the clocking and single stepping of the system 
while loading and reading serial diagnostics via Serial 
Shadow Registers (SSR) that are placed in key locations 
throughout the system. 

Data Section 

Data must be moved from the memory to the register file 
to be available to the ALU and multipliers for processing. 

The register file has four access ports, two ports for 
writing data into the file and two ports for reading data out 
to the ALU and multipliers. This arrangement allows two 
operands to be read from the file in the same cycle as two 
operands are being written. The two read operands are 
used either as A and B operands for the ALU, FPP, or PM,
or as address and data inputs to the memory. 

To move data from the memory to the register file, an 
address to the memory is selected from the register file 
on the A read port. This address selects a word from the 
memory that is transferred on the memory data bus to the 
B write port of the register file. 

Once data is loaded into the register file, it can then be
selected for use on either the A or B read ports for input
to the ALU, FPP, or PM.

Data processing results from the ALU, FPP, or PM are 
then placed on the Y bus for return to the register file A 
write port. 

Finally, processed data is moved back to the memory via 
the B read port of the register file, while the location to be 
written in the memory is addressed by the value on the A 
read port of the register file.

Figure 1-2. Am29300 Demonstration System



(NOTE: The advantage of using both write ports on the
register file is that it is possible to perform calculations
and write the results via the A write port at the same time
that new data is being moved into the register file from the
memory via the B write port. This will be illustrated in
more detail later in this document.)
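
The load, compute, and store steps of the data section can be summarized in a small behavioral model. The Python sketch below is purely illustrative: the class and method names are hypothetical, the register file depth of 64 words is assumed from the Am29334 organization, and timing and the 36-bit parity path are ignored.

```python
class DemoDataPath:
    """Behavioral model of the register file, memory, and ALU paths."""

    def __init__(self):
        self.regs = [0] * 64        # register file (depth assumed)
        self.memory = [0] * 16384   # 16K-word memory

    def load(self, addr_reg, dest_reg):
        """Memory to register file: A read port gives the address,
        memory data returns through the B write port."""
        address = self.regs[addr_reg]
        self.regs[dest_reg] = self.memory[address]

    def compute(self, a_reg, b_reg, dest_reg, op):
        """A and B read ports feed the ALU; Y returns via the A write port."""
        y = op(self.regs[a_reg], self.regs[b_reg]) & 0xFFFFFFFF
        self.regs[dest_reg] = y

    def store(self, addr_reg, data_reg):
        """Register file to memory: A read port gives the address,
        B read port gives the data."""
        self.memory[self.regs[addr_reg]] = self.regs[data_reg]
```

Because results return through the A write port while memory data arrives through the B write port, a `compute` and a `load` that target different registers could share one cycle in the real system; the model above simply runs them one after the other.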

Control Section 



D Bus 

The D bus is a highway for information flow between the 
microcode control store, interrupt control, sequencer, and
data section of the CPU. 

Branch addresses or constants from the microcode can 
pass to the sequencer via the D bus. The interrupt
controller's interrupt vector base address register may 
also be loaded via the D bus. 

Constants from the microcode can pass to the data 
section for use in calculations via the D bus to A bus 
transceiver. Microcode constants can also be used as 



addresses to the memory, via a D bus to A bus to memory 
address bus connection. 

Variable data can be passed from the register file to the
sequencer. The sequencer can also return data to the
register file, via the A bus to ALU Y bus to A write port
path. The D bus path to the sequencer is valuable for 
storing and retrieving the state information in the se- 
quencer when interrupts, traps, or context switches 
occur. 

Control Decode 

This section of logic expands encoded microcode fields 
into individual control lines used throughout the system. 

Interrupt Logic 

This circuit monitors interrupt and trap conditions such as 
parity errors and breakpoints. When an interrupt condi-
tion is detected, an interrupt request to the sequencer is
made and an interrupt address vector generated.

Sequencer

The sequencer is an address multiplexer with an on-chip 
address incrementer and stack. It selects the address for 
each microinstruction word read from the control store. 
The address selected depends on the instruction to the 
sequencer and on the state of test conditions. The 
sequencer can select addresses from the branch field of 
the control pipeline register, the macro opcode map, the 
internal stack, the increment of the last microinstruction 
address, or one of four status condition driven multi-way 
branch inputs. 

Macro Opcode Support 

Macro vs. Micro Programs: A microprogram is the 
definition for the state of the primary system control 
signals during each system clock cycle. Each word of 
microcode usually has a large number of bits so that 
many parallel operations may be controlled simultane- 
ously. Each microcode word must deal with the intricate 
details of system operation. The writing of microcode is 
a slow tedious process that must take into account every 
facet of system operation in order to provide the most 
efficient use of system resources. 

The advantage of microcode is that, very often, different 
system operations can be overlapped (done in parallel) 
since there is parallel control over all the system re- 
sources. 

A "macroprogram" is a series of microcode subroutine 
calls. Each macroinstruction has an opcode field that is 
simply a value that can be translated into the starting 
address of a microcode subroutine within the system 
microprogram. The macroinstruction may include para- 
meters that are passed to the microprogram. These 
parameters might be register addresses, loop counter 
values, immediate data, or memory addresses. 

The advantage of a macroprogram is that the instructions
are very simple and require relatively few bits to define as 
compared to a microcode word. The macroinstructions 
are simpler because all the details of system operation 
are specified by the underlying microcode instructions. 
The simpler instructions allow macroprograms to be 
written much more quickly than microprograms. Therefore,
once a set of microcode subroutines is developed
to perform the most often needed system operations, a
wide variety of macroprogram applications can be
quickly written. Macroinstructions remove the system
programmer's concern over every detail of system 
operation. 

The disadvantage of a macroprogram is that each in- 
struction must be fetched from memory and decoded 
(translated to a microcode subroutine address) before
each microcode subroutine is executed. When each
subroutine execution is long compared to the overhead 
of fetching and decoding the macroinstruction, the 
macroprogram will run nearly as fast as an equivalent
microprogram with the advantage being a much easier 
programming task. When the microcode subroutines are 
short compared to the macroinstruction overhead, the 
system speed can drop significantly. 
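
The trade-off can be made concrete with a little arithmetic. The sketch below assumes the fetch and decode overhead is paid in full before every subroutine, with no overlap with execution; the cycle counts are invented purely for illustration.

```python
def relative_speed(subroutine_cycles, overhead_cycles):
    """Fraction of pure-microprogram speed retained by a macroprogram.

    Assumes every macroinstruction costs `overhead_cycles` of fetch and
    decode on top of the `subroutine_cycles` of useful microcode.
    """
    return subroutine_cycles / (subroutine_cycles + overhead_cycles)


# Long subroutines hide the overhead; short ones do not.
long_routine = relative_speed(20, 2)   # about 0.91 of microprogram speed
short_routine = relative_speed(2, 2)   # 0.5 of microprogram speed
```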

So, if macroprogramming concepts are used carefully, a 
macroprogrammed approach to system design can yield 
a significant improvement in the ease of system use 
without a large decline in system performance. 

For that reason, the Am29300 demonstration system 
includes the features described below, which allow a 
macroprogrammed approach. These features are in- 
tended to show how basic macroprogramming can be 
implemented.

Macro Opcode Register: When macroinstructions are
executed, the instructions are addressed in the memory 
via the A read port of the register file in the same way as 
described earlier for data. The selected instruction is 
read from the memory via the memory data bus and 
written into the macro opcode register. The instruction 
can also be written into the register file via the B write port
in the same cycle (which may be useful for instructions 
that contain immediate operands that would be used by 
the data section). 

Macro Opcode Map RAM: The macro opcode map 
RAM is made of three Am9150 high speed SRAMs. The
opcode portion of the macro opcode register addresses 
a microcode entry point table in the map RAM. This entry 
point is then used by the Am29331 sequencer as a 
branch address to the microcode routine that performs 
the function required by the macroinstruction.

Macro Operands: The operand portion of the macro 
opcode register is loaded into the macro operand count- 
ers. The macroinstruction operands allow the direct 
specification of register file addresses, ALU shift values, 
or ALU field masks to be used by the microcode routines. 
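
A rough model of the opcode map lookup follows. The instruction layout here is hypothetical: the text does not give field widths, so the sketch assumes a 32-bit macroinstruction with a 10-bit opcode in the most significant bits and a 12-bit entry point (three 4-bit-wide RAMs side by side). Treat every constant as an assumption.

```python
WORD_BITS = 32        # macroinstruction width (assumption)
OPCODE_BITS = 10      # assumption: opcode occupies the top 10 bits
ENTRY_MASK = 0xFFF    # assumption: 12-bit microcode entry point


def split_instruction(instr):
    """Split a macroinstruction into (opcode, operand bits)."""
    operand_bits = WORD_BITS - OPCODE_BITS
    opcode = (instr >> operand_bits) & ((1 << OPCODE_BITS) - 1)
    operands = instr & ((1 << operand_bits) - 1)
    return opcode, operands


def entry_point(map_ram, instr):
    """Look up the microcode subroutine start address for an instruction."""
    opcode, _ = split_instruction(instr)
    return map_ram[opcode] & ENTRY_MASK


# The host would load the map before execution begins.
map_ram = [0] * (1 << OPCODE_BITS)
map_ram[3] = 0x140                       # hypothetical routine for opcode 3
branch_to = entry_point(map_ram, (3 << 22) | 0x12345)
```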



Register File Address, Position, and Width 
Multiplexers: Register file addresses are passed to the 
register file via the register file address multiplexer. Po- 
sition and width information for shift values and field 
masks are passed to the ALU via the position and width 
multiplexers. These multiplexers allow either the microc- 
ode or the macroinstructions to control the register file 
and ALU. 

SECTION 2

Nomenclature 



Throughout the remaining figures in this application note,
some naming and drawing conventions are used as 
noted below. 

All signal names are written as single word identifiers with 
underlines used to provide visual space between sec- 
tions of a multi-word identifier. 

Signals that are active low have names that end with an 
asterisk. In some of this document's programmable logic 
definition files, this convention is not allowed. In those 
situations, the active low signal names will begin with an 
exclamation point or end with an underline character. 

Clock and qualified clock signals have names that begin 
with CLK_. 

Groups of signals that form buses are shown as single 
lines with an associated number that indicates how many
lines are involved. Bus lines are drawn with 45 degree 
turns and intersections instead of the usual right angle 
turns and intersections used with individual signal lines, 
in order to highlight buses visually. Major data highways 
such as the A_BUS, B_BUS, and Y_BUS have signal 
names that end in _BUS. The lines of a bus are numbered
from least significant to most significant with the least 
significant identified as line zero (0). Where a subset of 
the lines in a bus is shown, the bus signal name will be 
followed by parentheses containing numbers that show 
the range of lines in use. The numbers of a continuous
range are separated by a colon (:); non-contiguously
numbered lines are separated by a comma (,). Where
lines of a bus are split out to show the specific connection
of bus lines in a circuit, a small number that indicates the
line number within the bus will be shown near each line 
that is split off. 
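
As an illustration of the bus subset notation, the following sketch expands a reference such as `EXT_ADD(17,16,2:0)` into an explicit list of line numbers. The function name and the returned tuple shape are choices made for this example only.

```python
import re


def parse_bus_ref(text):
    """Expand 'NAME(17,16,2:0)' into ('NAME', [17, 16, 2, 1, 0]).

    A colon separates the ends of a continuous range; commas separate
    non-contiguous lines. A bare name yields an empty line list.
    """
    match = re.fullmatch(r"([\w*!]+)\((.*)\)", text.strip())
    if match is None:
        return text.strip(), []
    name, body = match.groups()
    lines = []
    for part in body.split(","):
        if ":" in part:
            first, last = (int(x) for x in part.split(":"))
            step = 1 if last >= first else -1
            lines.extend(range(first, last + step, step))
        else:
            lines.append(int(part))
    return name, lines
```

Calling `parse_bus_ref("A_BUS(0:7)")` lists lines 0 through 7, the byte zero field of the A_BUS.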

Four major buses in the system share a common struc- 
ture. The A_BUS, B_BUS, Y_BUS, and MD_BUS all 
have the same layout. Each bus carries a 36-bit data 
word, which is arranged as four 8-bit bytes, each byte 
having its own parity bit. Byte zero (least significant) is
located in bits 0:7; bit 32 is the parity bit for byte zero. Byte
one is in bits 8:15 with its parity in bit 33. Byte two is in bits
16:23 with parity in bit 34. Byte three is in bits 24:31 with
parity in bit 35.
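
The 36-bit layout can be sketched as a pack and check pair. The text does not say whether parity is odd or even, so the sketch takes the sense as a parameter and defaults to even parity; that default is an assumption.

```python
def parity_bit(byte, odd=False):
    """Parity bit for one byte (even parity unless odd=True)."""
    ones = bin(byte & 0xFF).count("1")
    even_bit = ones & 1          # makes the total number of ones even
    return even_bit ^ 1 if odd else even_bit


def pack_word(data32, odd=False):
    """Build a 36-bit bus word: data in bits 0:31, byte parity in bits 32:35."""
    word = data32 & 0xFFFFFFFF
    for i in range(4):
        byte = (data32 >> (8 * i)) & 0xFF
        word |= parity_bit(byte, odd) << (32 + i)   # byte i -> bit 32 + i
    return word


def check_word(word36, odd=False):
    """True when all four parity bits agree with their data bytes."""
    word36 &= 0xFFFFFFFFF
    return pack_word(word36 & 0xFFFFFFFF, odd) == word36
```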

Signals that come directly from the microcode memory 
pipeline register have signal names that begin with "P_". 

Ground symbols (zero volt points) are drawn as down- 
ward pointing triangles, or the signal name GND is used. 

Points tied to +5 volts are labeled with the signal name
VCC.

Components are shown with pin numbers immediately 
outside the rectangle that defines the component. 
Component-specific signal names related to component 
pins may be shown immediately inside the component 
rectangle. Where there are several components shown 
on a page with very similar connections, only one of the 
components will have pin numbers and signal names 
shown. The remaining components on the page are 
wired in the same manner. 

Each component is assigned and labeled with a "U 
number" that uniquely identifies the component. This 
helps identify specific components for discussion and 
separates identical type devices in the system compo- 
nent list. 

Because this demonstration system is complex by nature,
it must be illustrated with many figures, each focusing
on a different portion of the overall system. In order to
show the signal interconnections between all parts of the 
system, each signal that leaves or enters a figure is given 
a name. Often the names are abbreviations in order to 
save space in the figures. Each name shows a relationship
to the signal's use. Wherever the same signal name
appears in different figures, a connection between the 
figures is defined. To help in identifying all the figures to 
which a signal travels, there is a signal-to-figure cross 
reference listing in Appendix B.

SECTION 3

Data Section Description 



REGISTER FILE 



Two Am29334 register files are used in tandem to provide
a 64-register by 36-bit wide file. This allows the
storage of 32-bit data plus parity (1 parity bit/byte). Each
Am29334 contains 64 registers that are 18 bits wide; see
Figure 3-1.

An Am29334 register file can both read and write data in 
the same cycle, but it does not perform the read and write 
simultaneously. The read must be performed during part 
of the system cycle and the write during another part of 
the cycle. Since read data is needed by the ALU and 
multipliers as early in the cycle as possible and, since 
data values to be written are only available later in the 
cycle, the reading of data is done in the first half of the 
cycle and the writing done in the second half of the cycle. 
A convenient way to separate the two parts of the cycle 
is to use the system clock signal to control the internal 
address mux and write enable. 

As connected in Figure 3-1, the read port latch enables
(LEA and LEB) and write port common enables (WEAC*
and WEBC*) are tied to the data section clock line
(CLK_D). This causes read data to be accessed while
CLK_D is high and read data to be latched when CLK_D 
is low. Data is written when CLK_D is low if the port write 
enables are active (WEAL* and WEAH*, or WEBL* and 
WEBH*). The high and low byte write enables for each 
port are tied together since only full 36-bit word writes will 
be done in this system. 

The various read and write addresses are provided from 
the register file address multiplexers, which will be cov- 
ered later. 

The output enable (P_OEA*) and write enables 
(P_WEA* and P_WEB*) come directly from the microc- 
ode pipeline register. 

ARITHMETIC LOGIC UNIT 

Am29332

The Am29332 provides a 64-bit funnel (barrel) shifter, 
32-bit mask generator, and 32-bit ALU. The ALU can 
perform binary and BCD add or subtract, multi-cycle 
multiply or divide, and logical operations. This single, 
highly-integrated chip provides the complete function of 
the ALU block in this system. The only added component
is an external register used to maintain status bits for the
macroprogram separate from status information used by
the microprogram. The ALU is shown in Figure 3-2.



Most of the control lines come directly from the microc- 
ode control pipeline register. 

The ALU output enable (ALU_OE*) is decoded from the 
control pipeline register. 

The POSITION and WIDTH signals come from the posi- 
tion and width multiplexers. These multiplexers select 
the position and width values from either the microcode 
pipeline or the macroinstruction in the macro opcode 

register. 

The slave mode input is tied to ground since there will be 
no use of the slave mode comparisons in this system. 

The HOLD input is used as an enable control over the 
clocking of the internal micro status register and Q 
register during times the ALU is not in use. Because the
ALU, FPP, and PM share the same data source and
destination buses (A_BUS, B_BUS, and Y_BUS), they
generally cannot be used simultaneously due to bus
contention. In recognition of this, the control fields for the
ALU, FPP, and PM have been overlapped in the microcode
to minimize the required width of each microcode
word. This means that at certain times the control lines to
the ALU will be meaningless to the ALU because the
values on the lines are determined by the needs of the
FPP or PM. Therefore, unless the hold input is used to
prevent clocking of the status and Q register during these
times, the ALU status could be lost whenever the FPP or
PM is in use.

Note, however, that the hold input is not used as the 
general means to prevent clocking of the ALU registers 
when the whole system is halted (e.g., during single step 
mode). The data clock (CLK_D) that is distributed 
throughout the data section of the CPU is a qualified clock
and will be used to control the state change of all registers
in the data section, including those in the ALU at times
when the whole system is halted. 

Macro Status Register 

There are two levels of status information that the programmer
of a microprogrammed system must track if that
system executes macroinstructions. These are referred
to as the micro and macro status. The micro status of the 
system is updated at the end of each microcode step and 
is part of the system state. The macro status is part of the 
macroprogram state as reflected at the end of each 
macro step. Since many microinstructions may be executed
to perform the function defined by a given macroinstruction,
the macro status reflects the machine state
from the macroprogram viewpoint. The macro status
may be carried across many microinstruction cycles
without change. This requires a separate register to
contain the macro status independent of the micro status.
The Am29332 does not have an internal macro status 
register so one must be provided externally. The loading 
of the macro status register and the use of the macro 
status information by the microprogram must be controlled
by microcode. The Am29332 does provide an on-board
multiplexer to select between the micro and
macro status inputs. Only the carry and link values are
used directly by the Am29332 since these are the only
status values normally used to modify data values. The
macro status for the zero, sign, and overflow flags can
be used by the sequencer as test conditions for branch
instructions.

The register used for holding macro status is an
Am29818-1. The register is loaded (clocked) by a qualified
clock called CLK_MAC_STAT. This clock is qualified
by the load macro status bit in the control pipeline
register. The Am29818-1 is also used to provide a
diagnostic ability to read and load the macro status
register through the use of an internal serial shadow
register (SSR).



FLOATING POINT PROCESSOR 

Am29325 

The Am29325 Floating Point Processor (FPP) performs 
32-bit floating point multiplication, addition, or subtrac- 
tion in a single cycle. Floating point division can be done 
in seven cycles using the Newton-Raphson method. The 
FPP is shown in Figure 3-3. 

All the control lines for the FPP are driven directly by the 
microcode pipeline register with the exception of the FPP 
output enable and the register flow-through enables. 
Those signals are decoded from the data path select field 
of the microcode pipeline register. The output enable 
decode is done by the AmPAL22V10 in Figure 3-3. The 
register flow through enable decode is done by the 
control decode logic which is described later. 

It should be noted that the Am29325 is not a full-fledged
member of the Am29300 family. It is different from the
other Am29300 members with regard to three key characteristics:
it is slower, does no data bus parity checking
or generation, and has no slave mode capability.

The Am29325 flow through calculation time is 100 to
125 ns rather than the 42 or 70 ns for the ALU or PM
(the current PM is at 120 ns, but the fastest version will
be at 70 ns). This requires that whenever the FPP is
used, the system clock cycle must be extended to allow
for the slower propagation time. This extended clock
timing is covered later in more detail.

The lack of parity checking is not much of a problem for 
the rest of the system since it only affects the data 
integrity of information going through the FPP. The lack
of parity generation isn't a problem as long as only the 
FPP is working on the data. The problem starts when 
floating point data is moved back to memory or is con- 
verted to integer values for use by the ALU. 

If data from the FPP is read by the ALU or PM, parity 
errors will be detected and a system interrupt may
result. That problem can be avoided if the system has
kept track of which data resulted from FPP calculations
and if the parity errors are ignored when that data is
read. But if FPP data results are moved directly to the 
memory and then on to the host system, the parity errors 
will eventually be found. 

So some means of adding parity generation to the FPP 
should be provided. One way is to add four 8-bit parity 
generator chips to the FPP output bus. This consumes 
power and board space while providing a benefit only
when FPP data is moved directly through the register file 
to the memory. A better way is to use the parity genera- 
tors already available in the Am29332 by requiring that 
FPP data be passed through the ALU before being 
moved to the memory. Even though the data may not be
modified by the ALU, correct parity will be generated on 
the ALU output. 

With the use of a little trick, there is a way to provide parity
checking on the FPP data inputs. To do this, one of the
data path select codes is used to control the output
enables of both the ALU and FPP. This code (P_DSP =
11) causes the FPP outputs to be disabled and the ALU
outputs enabled, even though the data path selected is
the FPP. By turning on the ALU outputs, the ALU parity
error output will also be enabled and any parity error on
the A_BUS or B_BUS will be reported. At the same time,
the control microcode for the FPP is still valid and may be
used to load registers with the data present on the 
A_BUS and B_BUS. Of course the register file should not 
be loaded from the Y_BUS in the cycle where this 
scheme is used because the ALU is driving nonsense 
information onto the Y_BUS. Enabling the ALU outputs 
is only a trick used to make the ALU parity checker results 
available for this scheme. Note that the ALU hold input 
remains active even though the ALU output enable is 
active. This prevents any state change in the ALU when 
the FPP is the data path actually in use. 

Finally, the issue of no slave error checking is unimportant,
since the slave mode is not used in this system.

FPP External Status Register

Status Pipeline Issue 

The FPP status flags appear at the status outputs along 
with data at the Y outputs. If the FPP "F" register is made
transparent, the status flag register is also transparent. If
the F register is clocked, so is the status register. In this
demonstration system this presents a problem. 

Normally, status conditions from the data section are 
registered before being used by the control section. This 
maintains the pipelined, parallel operation of the control 
and data sections. The control section bases its testing 
on registered status from the last data section cycle 
rather than being forced to wait for status results of the 
current execution cycle before determining the next 
microinstruction to execute. 

To provide the same system for the FPP requires an 
external status register for cycles in which the F register 
is transparent to allow results to pass directly to the 
register file. In that situation the status flags are not 
registered by the FPP and thus, without an external 
register, there is no place to pipeline the status for the 
control section. 



Multiple Status Flag Test Issue 

Several of the FPP status flags signal events of equal 
importance such that it would be a convenience to be
able to test multiple flags in a single cycle rather than 
basing branches on only one flag at a time. 

A simple way to test multiple conditions at one time is to 
execute a multi-way branch based on the bits being 
tested. In the case of the FPP there are six flags, too 
many for a single multi-way branch which can be based 
on only four bits. A solution is to OR some of the flags 
together as one of the multi-way branch bits and use the 
remaining bits directly as part of the multi-way branch 
address. In that way, one multi-way branch can test all 
six flags. 

When testing the status, if no flags are active, no abnor- 
mal condition exists, and the zero value destination of the 
multi-way branch continues. If one or more of the direct 
flags is active, the multi-way branch goes straight to a 
routine to handle the problem. If one of the ORed flags is 
active, the multi-way branch destination instruction can 
either ignore the flags or take a second multi-way branch 
that is based on direct inputs of the flags that were ORed 
in the first multi-way branch (an advantage of having 
more than one source for multi-way branch conditions). 
The second multi-way branch determines which of the 
ORed flags was active in the first multi-way branch. 



FPP Status Register Implementation 

An AmPAL22V10 Programmable Array Logic device is 
used to register the FPP status flags and perform the OR 
of some of the flags. 

This external status register loads new status only as the
result of cycles in which the FPP is the selected data path
during an instruction execution. When the FPP "F" register
is in transparent mode, the external status register is
loaded with the flags at the end of an FPP cycle. This
results in a one level deep pipeline on status in the same
way that ALU status is pipelined one level internal to the
ALU. When the F register is in clocked mode, the external
status register will load in the cycle following an FPP
cycle. This will capture the data that is loaded into the
FPP on-chip status register at the end of the FPP cycle.
This causes the status to be double pipelined for cycles 
in which the F register is clocked. 

The multi-way branch outputs for the first level branch are
the following flags: Overflow, Underflow, Invalid, and the
OR of the Inexact, NAN, and Zero flags. The multi-way
branch outputs for the second level branch are:
Inexact, NAN, Zero, and Ground.

These groups of four bits are substituted for the least 
significant four bits of a branch address to act as a multi- 
way branch. 
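
The two-level flag test can be sketched as follows. The text names which flags feed each branch but not their bit positions, so the ordering below (first listed flag in the most significant of the four bits) is an assumption, as are the class and function names.

```python
from dataclasses import dataclass


@dataclass
class FppFlags:
    overflow: bool = False
    underflow: bool = False
    invalid: bool = False
    inexact: bool = False
    nan: bool = False
    zero: bool = False


def first_level(f):
    """Overflow, Underflow, Invalid, and the OR of the remaining flags."""
    minor = f.inexact or f.nan or f.zero
    return (f.overflow << 3) | (f.underflow << 2) | (f.invalid << 1) | int(minor)


def second_level(f):
    """Inexact, NAN, Zero, and a grounded (always zero) low bit."""
    return (f.inexact << 3) | (f.nan << 2) | (f.zero << 1)


def branch_target(branch_addr, status_bits):
    """Substitute four status bits for the low four bits of a branch address."""
    return (branch_addr & ~0xF) | (status_bits & 0xF)
```

With no flags set, `branch_target(0x230, first_level(FppFlags()))` stays at 0x230, the zero value destination where normal execution continues.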

In addition to the multi-way branch test for flags, an added
output of the status PAL ORs together the Overflow,
Underflow, and Invalid flags for use as an interrupt signal
to the system interrupt controller, thus giving one additional
way to monitor the FPP error flags. Using the
interrupt approach eliminates the need to follow floating
point operations with multi-way branches in order to test
for error conditions. Execution of instructions can proceed,
assuming no major problems exist in an FPP cycle.
If one of the above mentioned error flags is active, the
resulting interrupt will deal with the error.

One last element of the status PAL is that it acts as part
of the system control decode by decoding the data path 
select bits of the control pipeline to enable the FPP output 
when the FPP is the selected data path. 

The logic definition file for the status PAL is listed in 
Appendix C. 



Seed Look-Up Table 

The Newton-Raphson division algorithm does a division 
of A by B by finding the inverse of B (i.e., 1/B) and 
performing a multiply against A. This scheme works with 
the Am29325 since finding the inverse of B requires only 
a series of multiplies and subtracts which the Am29325
can do in single cycles. But these multiplies and subtracts
are performed only to refine the accuracy of a
precalculated seed value (a rough approximation of the
inverse of B). So a table of seed values must be available
to support division with the Am29325.

This seed table is stored in PROM memory external to the
FPP. The B variable is used to address the seed table,
and the resulting seed value is fed into the FPP to be
refined.
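
The seed-and-refine idea can be sketched in ordinary floating point. The refinement step used here is the standard Newton-Raphson reciprocal iteration x' = x(2 - Bx), which needs only multiplies and a subtract. The 16-entry table, its indexing, and the iteration count are assumptions for illustration; they do not describe the actual PROM contents or the cycle count of the hardware.

```python
import math

TABLE_BITS = 4  # assumption: 16-entry seed table indexed by leading fraction bits

# Seed for each slice of the fraction range [0.5, 1): reciprocal of its midpoint.
SEED_TABLE = [1.0 / (0.5 + (i + 0.5) / (1 << (TABLE_BITS + 1)))
              for i in range(1 << TABLE_BITS)]


def seed_reciprocal(b):
    """Rough approximation of 1/b looked up from the seed table."""
    fraction, exponent = math.frexp(abs(b))          # abs(b) = fraction * 2**exponent
    index = int((fraction - 0.5) * (1 << (TABLE_BITS + 1)))
    return math.copysign(math.ldexp(SEED_TABLE[index], -exponent), b)


def divide(a, b, iterations=3):
    """Compute a / b by refining a seed of 1/b, then multiplying by a."""
    if b == 0:
        raise ZeroDivisionError("zero divisor")      # checked during divide setup
    x = seed_reciprocal(b)
    for _ in range(iterations):
        x = x * (2.0 - b * x)                        # error roughly squares each pass
    return a * x
```

Because the relative error roughly squares on each pass, a seed good to about 3 percent reaches roughly single precision accuracy after two or three refinements.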

Placing the seed table in the path to one of the FPP inputs 
normally requires a 32-bit multiplexer to select between 
the PROM and the direct input bus for loading normal 
operands in multiply, add, and subtract operations. Building
this multiplexer would require at least six hex 2-to-1
multiplexer chips. The PROM and multiplexer would also
increase the propagation time needed to load the FPP,
thereby requiring the cycle timing to be extended even
more than is already required by the FPP.

The implementation of the seed table in this system has
been modified to save chips and cycle length. Instead of
placing the seed table between the A_BUS and the FPP,
it is placed to the side as an appendage of the A_BUS
(see Figure 3-3). The inputs and outputs of the table are
tied together and to the A_BUS. The internal structure of
the table is shown in Figure 3-4. It contains three
PROMs, each of which is followed by a three-state output
register (the Am27S25 has an internal register). In this
arrangement the PROMs can be accessed by the value
present on the A_BUS in one cycle and the resulting seed
loaded into the registers. In the following cycle the
registers can drive the A_BUS with the seed value. This
scheme requires three fewer chips and no extension to
the FPP cycle time. It is true that two cycles are now
required to load the seed value but the cycle used to
access the seed table can be combined with the
operation of checking for a zero divisor. This operation is
generally done during the setup for a divide.

Figure 3-4. Floating Point Block Seed Look-Up Table - Data Flow Diagram

The detailed connections of the seed table are shown in
Figure 3-5. The Am27S25 contains the seed values for 
the exponent and the two Am27S43s contain the seed for 
the fraction. The seed table output enable (SEED_OE*) 
signal is a decoded output of the microcode control 
pipeline register. The output register of the seed look-up 
table is clocked by the data section clock. 

PARALLEL MULTIPLIER 

The entire Parallel Multiplier (PM) block's function is 
provided by the single chip Am29C323 Parallel Multi- 
plier. This chip performs 32-bit, 64-bit, 96-bit, and 128-bit 
integer multiplies. It also can perform multiply accumulate
using an internal 67-bit accumulator. The PM is
shown in Figure 3-6.



Most of the control signals come directly from the control 
pipeline register. The Parallel Multiplier output enable 
(PM_OE*) is decoded from the data path select field of 
the microcode pipeline register. The enable and flow 
through controls for the instruction register (ENI* and 
FTl) are tied respectively to GND and VCC to allow 
instructions to flow directly from the microcode pipeline 
register to the multiplier, since the microcode pipeline 
register already provides the one level of pipeline re- 
quired in the system. The flow through enable on the 
product register is enabled only when the PM data path 
is selected via the control decode logic. 

SECTION 4

Memory and External System Interface 



The memory block and external system interface are 
discussed together in this chapter because of the tight 
interconnection between these areas. It is helpful to view
the two blocks together in order to understand the shared
use these blocks make of the memory address bus
(MA_BUS) and the memory data bus (MD_BUS). Figure
4-1 shows a block diagram of the data and address
paths used in these sections.

One thing to note is that both the memory and the 
external interface are not elaborate in design. Essentially 
the external I/O section of this system is just a second 
port on the system memory. This system does little more 
than provide a simple arbitration scheme on access to 
the memory that allows an externally supplied DMA 
device to load and retrieve data from the memory. Event
or interrupt signaling between the CPU and host system
is limited to a single pair of interrupt signals, one from host
to CPU, one from CPU to host. Memory itself is only a
simple bank of static RAM with two address counters on
the input that help speed up array calculation.

The reason for this simple approach is that the design of
the CPU using the Am29300 family of building blocks is
the focus of this application note. Every reader who may
find the information in this application note useful will
have different memory and I/O requirements to handle
and will very likely design individual approaches to memory
and I/O. Therefore, only this simple approach is
covered here so that more time can be spent discussing
the CPU design.

EXTERNAL BUS INTERFACE CONTROL

Host Access Definition 

A block diagram of the host interface controller and its
connection to the MA_BUS and MD_BUS buffers is
shown in Figure 4-2.

The Am29300 demonstration system is treated as a coprocessor
to some host system. It ultimately gets all of its
instructions, data, and control from the external host
system. To provide communication with the host using a
minimum of design effort and special hardware, only two
portals into the Am29300 system are allowed.

One portal is the Am29300 memory, which is treated as
a dual port memory with all words directly mapped into
the host bus address space. With this, the host has
complete access to macroinstructions and data going
into and out of the system.

The second port is a serial diagnostics shift chain that
runs through key control registers of the system. This
serial pathway gives access to loading and reading the
microcode writable control store, to the control pipeline
register, to loading and reading the macro opcode map
RAM, to the macro opcode register, to the macro status
register, and to the interrupt base address register.



Through this serial port, the microinstructions are loaded 
by the host before program execution begins. Also, the 
system clocks can be controlled by the host to allow 
diagnostics and code debugging via single stepping and 
breakpoints. 

These portals are controlled by a state machine that is
separate from the Am29300 system. The state machine
is referred to as the host interface controller. It constantly
monitors the external host address bus. When the host
presents an address that matches a preset address on
the Am29300 system board, the host interface controller
is selected to perform one of several interface functions.

Any function requested by the host takes priority over 
anything that the Am29300 CPU is doing. The host
always gains control of the memory address and data 
buses as soon as the CPU clocks can be stopped and the 
CPU to memory bus buffers disabled. 

The function performed is dependent on the address
used; thus the commands from the host to the interface
controller are memory mapped. A 24-bit address from the
host is assumed for this design. The 6 most significant
bits (23:18) of the address are matched to the Am29300
system board address to select the host interface controller.
The next two most significant bits (17:16) are used to
select a command mode. The 3 least significant bits (2:0)
are used to select a specific command function within two
of the command modes.
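
The address fields described above can be decoded as in the sketch below. The bit positions come from the text; the function name, the `board_id` parameter, and the choice to return `None` for a non-matching address are illustrative assumptions. The meaning of each mode and function code is not modeled.

```python
def decode_host_address(addr, board_id):
    """Split a 24-bit host address into (command mode, command function).

    Returns None when bits 23:18 do not match the board address, meaning
    the host interface controller is not selected.
    """
    addr &= 0xFFFFFF                       # 24-bit host address
    if ((addr >> 18) & 0x3F) != (board_id & 0x3F):
        return None
    mode = (addr >> 16) & 0x3              # bits 17:16
    function = addr & 0x7                  # bits 2:0
    return mode, function


# Hypothetical board address 0b101010, mode 2, function 3.
example = decode_host_address((0b101010 << 18) | (0b10 << 16) | 0b011, 0b101010)
```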

Host Interface Block Diagram 

The 6 most significant bits of the host address are
checked by the address recognition block; if the address
matches the board address, then the match signal is fed
into the input of a synchronizing register. Also fed into this
register are: the external bus write enable line
(EXT_WEN*); the external address bits 17, 16, 2:0
[EXT_ADD(17,16,2:0)]; and the host system reset line.

The synchronizing register is clocked by a free-running 
version of the Am29300 system clock. The register used 
has special meta-stable hardened circuitry that prevents 
the outputs from oscillating, regardless of the timing 
relationship of input data to clock. This register allows the 
entire Am29300 system to run asynchronously with 
regard to the host system clock. All the interaction be- 
tween the host system and the Am29300 system is 
synchronized to the Am29300 system clock by the regis- 
ter. Each command to the host interface controller is thus 
presented at the output of this register in synchronization 
with the host interface controller clock. 



The heart of the host interface is an Am29PL141 Fuse
Programmable Controller. It is a microprogrammed
sequencer with on-chip microcode memory and pipeline
register. This sequencer implements the state machine
functions needed to control the interaction between the
host and the Am29300 system. Used with the
Am29PL141 is an AmPAL22V10 PAL. This PAL collects
together some glue logic functions: an interrupt signal
latch, a multiplexer, and some encoding logic, all of which
are described later.

The Am29PL141 provides control signals to the clock
gating and distribution section of the Am29300 system. It
also controls the enabling of all the buffers and
transceivers that connect with the MA_BUS and MD_BUS. The
controller acts as a "traffic cop" that allows only one driver
on those buses at a time to prevent contention. The
controller also manages the loading, reading, and
shifting of the Serial Shadow Register diagnostic chain.

The Serial Shadow Register (SSR) diagnostics port is a
32-bit-wide parallel read and write register that also
functions as a shift register. Data to be read or written to
the SSR diagnostic chain is loaded or read via this port.
The port is connected to the host via the MD_BUS. The
port is built from four Am29818-1 SSR diagnostic pipeline
registers. These registers, like all the registers in the
diagnostics chain in this system, contain one normal
parallel input and output pipeline register that is backed
up or "shadowed" by a second parallel input and output
register that also acts as a serial shift register. The
pipeline register can be loaded from the shadow register
and the shadow register can be loaded from the outputs
of the pipeline register. This gives the ability to move data
into or out of the pipeline register via the shadow register.
Data in the shadow register can be serially shifted to
other similar registers in the system. By connecting all the
diagnostic serial shadow registers together in a serial
chain, data can be moved serially through a large number
of key registers in the system using very few wires.
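
The behavior of a shadowed register and a serial chain can be illustrated with a small software model. This is a sketch under stated assumptions, not a description of the Am29818 internals: the class and function names are hypothetical, and the shift direction (data enters at the least significant bit and leaves from the most significant bit) is chosen only for illustration.

```python
class ShadowedRegister:
    """One diagnostic register: a pipeline register plus a shadow shift register."""

    def __init__(self, width):
        self.width = width
        self.pipeline = 0
        self.shadow = 0

    def load_shadow_from_pipeline(self):
        self.shadow = self.pipeline

    def load_pipeline_from_shadow(self):
        self.pipeline = self.shadow

    def shift(self, sdi):
        """Shift one bit in (assumed LSB end); return the bit shifted out."""
        sdo = (self.shadow >> (self.width - 1)) & 1
        mask = (1 << self.width) - 1
        self.shadow = ((self.shadow << 1) | (sdi & 1)) & mask
        return sdo


def shift_chain(chain, sdi):
    """Clock every register in a serial chain once; return the chain's output bit."""
    for reg in chain:
        # Each register receives the pre-clock output bit of the one before it.
        sdi = reg.shift(sdi)
    return sdi


# Move a 4-bit value from one register to the next over a single serial wire.
first, second = ShadowedRegister(4), ShadowedRegister(4)
first.pipeline = 0b1010
first.load_shadow_from_pipeline()
for _ in range(4):
    shift_chain([first, second], 0)
second.load_pipeline_from_shadow()
```

After four shift clocks the value captured from the first pipeline register has moved into the second register's shadow, and the final load transfers it into that register's pipeline.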

The SSR diagnostics port is just an extra section of the 
diagnostics chain that runs throughout the Am29300
system. This extra section is connected to the MD_BUS 
to serve as a parallel input and output port that gives 
access to the serial shadow register chain. 



A slightly more detailed view of the host interface
controller is shown in Figures 4-3 and 4-4.

Event Signals 

The host and the Am29300 system need to be able to 
signal each other when important events occur, such as 
the transfer of ownership over sections of the dual port 
memory. To allow this, a simple interrupt setting and 
clearing scheme is provided. 

The host interrupts the Am29300 system with a
command to the host interface controller. The controller in
turn sets an interrupt flag in the Am29300 system
interrupt controller. The interrupt is cleared when the
Am29300 services its interrupt controller.

The Am29300 interrupts the host by using a microcode
bit to set a latch that drives an interrupt line on the external
bus. The interrupt is cleared whenever the host does an
operation on the SSR port. The interrupt latch is
implemented in the AmPAL22V10, as shown in Figure 4-4.

Memory Enable

The Am29300 system memory can be enabled by 
either the Am29300 microcode or by the host interface 
controller. A simple multiplexer is needed to direct the 
correct control signal to the memory enable input. This 
logic is also implemented in the AmPAL22V10 shown
in Figure 4-4. 

AmPAL22V10 Support Logic 

Figure 4-4 shows the logic for the AmPAL22V10 that
integrates the interrupt signal latch, SDI multiplexer, and
memory enable logic. The logic equation definition file for
this PAL is listed in Appendix D.

SSR Diagnostics 

SSR Shift Path 

Figure 4-5 shows a block diagram of how the serial
shadow registers in the system are linked together and
how they relate to the macro opcode map RAM,
sequencer, and microcode control store. Most of these
registers are also depicted in other figures throughout
this application note in their roles as parallel input and
output pipeline registers. Figure 4-5 emphasizes the
serial in and out and control connections of the shadow
registers also contained in these registers.

The SSR diagnostics port is shown as the starting and
ending point for the entire shift chain (or loop as seen
here). Data to be loaded into the SSR loop is parallel
loaded into this register from the MD_BUS via the
bidirectional outputs of the registers in this port (note: the
shadow register in the Am29818-1 gets its input from the
output pins of the Am29818-1 pipeline register).

Data loaded into this shadow register is then shifted into
one of two branches of the SSR loop. One branch flows
through the Writable Control Store (WCS) port and the
microcode control store pipeline shadow registers. The
WCS port is used to address the microcode control store
or to receive (load) data from (to) the macro opcode map
RAM. The microcode control store shadow register is
used to write data into the microcode writable control
store or to read the contents of the control pipeline



[Block diagram: SSR Port (U18:U21, Am29818), Interrupt
Base Address register (Am29818), Macro Opcode Register
(Am29818), Status Register (Am29818), Opcode Map RAM
(Am9150), Sequencer (Am29331), WCS Port (Am29818), and
Control Pipeline Register (Am29818), linked through the
SDI multiplexer. Signals shown: MD_BUS (31:0),
SSR_BUS_EN, DCLK SSR, MODE, WCS_WR, DCLK WCS.]

Figure 4-5. Serial Diagnostics Shift Path






register. The second branch flows through the macro
opcode, macro status, and the interrupt base address
registers. The macro opcode register is used in part to
address the macro opcode map RAM.

These branches are separate because it helps to shorten 
the shift chain length by using branches and because the 
shift chain clock to the writable control store and WCS 
port must be separate from the shift clocks to the rest of 
the diagnostics chain. The shift clocks must be separate 
because of the way the writable control store is loaded. 

The data outputs of the control store are connected to the
inputs of the pipeline register as required for normal use 
in the system. To write the memory, the inputs must be 
driven with the data to be written, turning the input pins 
into outputs. In the Writable Control Store (WCS) pipeline 
register this is fine, since the memory outputs are dis- 
abled during the write. 

If other diagnostic registers in the system were tied to the 
same shift clock and mode control lines as the WCS 
pipeline, there could be a problem every time the WCS is 
written. The other diagnostic registers not involved in the 
WCS write would see the same control signals as the 
WCS registers and would drive their input pins. Depend- 
ing on what the other registers were connected to, this 
situation could cause serious contention problems 
through the system. 

For this reason, the SSR used to load WCS is treated
separately from other SSRs in the system. It is
worth noting that the only control signal that need be
separate is the shift clock. The mode and serial path may
be shared with all SSRs in the system. Putting the SSR
into WCS loading mode requires the shift clock to load an
internal mode flip-flop. If the shift clock is active only to the
SSR used for WCS when the MODE and Serial Data In
(SDI) signals are set high, only the WCS SSR will go into
the input pin driving mode.

The end of each branch in the SSR loop returns to a
multiplexer at the serial data input (SDI) of the SSR
diagnostics port. This multiplexer allows the selection of
the shifted branch into the port when the SSR loop is
being read rather than written. It also allows the SDI value
to be forced when the MODE signal is high. When the
MODE signal is high, all the SSRs in the system pass
their SDI directly to their Serial Data Output (SDO). This
causes the SDI value forced at the input of the SSR port
to be passed directly to all SSRs in the system (note:
significant propagation time from SDI to SDO for each
SSR is involved). In this way the forced value of SDI
becomes an additional control signal to all the SSRs in
the system. The function of this multiplexer is integrated
into the AmPAL22V10 as shown in Figure 4-4.

SSR Reading and Writing 

To read the contents of the pipeline registers in the 
Am29300 system, the host must first send a command to 
load the SSR throughout the system from the pipeline 
registers. Then the host must shift the contents of the 
SSR into the SSR port register (up to 32 bits at a time). 
The host then performs a read of the SSR port. The host 
then repeats the shifting-and-reading process until the 
entire SSR chain has been read. 

To write the system pipeline registers, the host reverses 
the above procedure. Data is first written into the SSR 
port. Then the SSR chain is shifted to move data into 
position. The SSR port loading and SSR chain shifting go 
on until the section of the SSR chain desired is filled. 
Finally a pipeline load command is issued by the host to 
load the contents of the SSR into the pipeline registers. 

To write the macro opcode map RAM and the microcode
writable control store (note: these are treated as a single
WCS and must be written together), an address for the
map RAM is first loaded into the macro opcode pipeline
register via the method described above. Then the
address for the microcode WCS is loaded into the WCS port
pipeline register. Next, the data to be written into the map
RAM and into the microcode WCS is shifted into the WCS
port SSR and WCS SSR. A load WCS command is then
given which performs the actual write of data into the
memories. During the write operation the output of the
WCS port is enabled and the Am29331 sequencer output
is disabled (via its HOLD pin).

The only trick involved in the SSR Reading and Writing is
knowing how much to shift the SSR during each read or
write. The problem is that the SSR chain length in this
system (and in nearly every real system) is not an even
multiple of the SSR port size. During the first (or last) shift
operation of either the read or the write of pipeline
registers, it will be necessary to shift fewer than the full 32
bits of the SSR port. The number of bits to be shifted
depends on the chain length. One thing to note is that the
chain length will be in a multiple of 4 bits because
diagnostic pipeline registers are currently available only
in 4-bit and 8-bit devices. So, when a shift operation is
commanded by the host, the number of nibbles (4-bit
shifts) to be shifted must be indicated.
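
The nibble counts for a complete read or write can be worked out ahead of time. The following Python sketch is illustrative only: the function name is hypothetical, the 72-bit chain length is an invented example rather than the length of any chain in this system, and the short shift is placed first although the text allows it to be first or last.

```python
PORT_BITS = 32   # width of the SSR port
NIBBLE = 4       # shift operations are specified in 4-bit units


def shift_plan(chain_bits):
    """Return the nibble count for each shift needed to cover the chain."""
    if chain_bits % NIBBLE:
        raise ValueError("chain length must be a multiple of 4 bits")
    full, rest = divmod(chain_bits, PORT_BITS)
    # One short shift for the remainder, then full 8-nibble shifts.
    plan = [rest // NIBBLE] if rest else []
    plan += [PORT_BITS // NIBBLE] * full
    return plan


# Hypothetical 72-bit chain: one 2-nibble shift, then two full 8-nibble shifts.
example_plan = shift_plan(72)
```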

A final note: during the shifting of the WCS SSR, the
Am29300 system clocks must be halted. This is because
the pipeline clock and shift clock to the Am9151
may not occur within 65 ns of each other. Since these
clocks would occur within the above window in this
system, the pipeline clock must not be active.

Controller Description 

Function/Command Descriptions 

The following is a list of the address values for functions
that the host interface will perform when addressed by
the host:



Memory Access: Reading and writing of the Am29300 
system memory is done by selecting the address for the 
Am29300 system with address bits 16 and 17 equal to 
zero. The address for the specific word in memory is 
contained in address bits 0:15. The host interface con- 
troller, upon recognizing the host access, will stop the 
clocks to the Am29300 system and disable the CPU to 
MA_BUS and MD_BUS buffers. At the same time the 
external bus to MA_BUS and MD_BUS transceivers are 
enabled. This suspends the operation of the Am29300 
system and gives memory access to the external host. 
The write enable line on the external bus determines 
whether a read or write occurs. 

Note that by suspending the Am29300 system operation, 
the memory access is transparent to (or hidden from) the 
CPU. There is no action required on the part of the 
Am29300 microcode or interrupt control. 

Serial Diagnostics Port Access: This access is very 
similar to that of a memory access. The difference is that 
the SSR port register is being read or written instead of 
memory. 



ADDRESS BITS
17 16   2  1  0   FUNCTION

 0  0   X  X  X   Am29300 Memory Access
 0  1   X  X  X   Serial Diagnostics Port Access
 1  0   0  0  0   Illegal code
 1  0   0  0  1   Halt CPU
 1  0   0  1  0   Run CPU
 1  0   0  1  1   Single Step CPU
 1  0   1  0  0   Single Step CPU Control Section
 1  0   1  0  1   Single Step CPU Data Section
 1  0   1  1  0   Interrupt CPU
 1  0   1  1  1   Reset CPU
 1  1   0  0  0   Illegal code
 1  1   0  0  1   Load Pipeline Register
 1  1   0  1  0   Load Macro Opcode Register
 1  1   0  1  1   Load Writable Control Store
 1  1   1  0  0   Load Initialization Register
 1  1   1  0  1   Load Serial Shadow Register
 1  1   1  1  0   Shift WCS SSR Chain
 1  1   1  1  1   Shift Macro Opcode SSR Chain

Halt CPU: This command throws the Am29300 system
clocks into a continuous stop condition until the mode is
cleared by the Run CPU command or temporarily
overridden by one of the single step commands.

Run CPU: This command starts the Am29300 system 
clocks running. 

Single Step CPU: When the CPU is halted, this com- 
mand will cause all the system clocks to cycle once to 
advance the state of the CPU one step. Note that gated 
clocks will be active during this cycle only if their enables 
are active (i.e., gated clocks operate as they would during
a normal clock cycle; they are not forced to operate). 

This mode is useful during diagnostic operations to single 
step the machine between serial load and unload of the 
SSR diagnostics. 

Single Step CPU Control Section: This will step only 
the clocks in the control section of the CPU. The control
pipeline, macro opcode, macro operand, status, se- 
quencer, and interrupt registers may be affected. 

This is useful for forcing the control section into a new 
state under the control of diagnostics, such as a forced 
branch to a new location in the microcode. This is done 
by first loading the control pipeline with an instruction to 
branch via the SSR diagnostics chain. The control sec- 
tion would then be single stepped to execute the branch. 
Note that during these operations, the data section is not
affected and no data is modified. 

Single Step CPU Data Section: This operation single 
steps the clocks only in the data section of the CPU. This
may be useful for repetitive diagnostic operations involv- 
ing only the data section. 

Interrupt CPU: This command causes the host interface
controller to set an interrupt input to the Am29300 system
interrupt controller. The interrupt controller in turn
prioritizes the interrupt and causes an interrupt to the CPU
when that type of interrupt is enabled.

Reset CPU: This will make the reset line to the Am29300
system active and step all the ungated system clocks.
The clocking is required by some parts of the system to
effect reset state changes.

Load Pipeline Register: This command will step only 
the clock to the control pipeline and WCS port for one 
cycle while forcing the pipeline registers to load data from 
the SSR chain. This is used to control the state of the 
pipeline through serial diagnostics. 



Load Macro Opcode Register: This steps only the clock
to the macro opcode, macro operand, status, and
interrupt base address pipeline registers while forcing the
registers to load from the SSR chain.

Load Writable Control Store: This command initiates a
series of clock cycles that cause data in the SSR chain to
be loaded into the writable microcode control store and
the macro opcode map RAM. The address loaded is
also specified in the SSR chain.

Load Initialization Register: Like the previous
command, this operation loads the writable microcode store.
The difference is that only the WCS (Am9151) initialize 
registers are loaded from the SSR chain. 

Load Serial Shadow Register: This causes the
contents of all diagnostic pipeline registers to be copied into
the related SSR chain elements. This is used to read the
Am29300 system state into the SSR chain so that it can
be shifted out to the host.

Shift WCS SSR Chain: This command shifts the
contents of the SSR port register into the SSR diagnostics
chain used for the writable control store. It also brings the
bits at the end of the WCS SSR chain into the SSR port
register. This is the serial read and write operation of the
WCS SSR chain (or loop).

Shift Macro Opcode SSR Chain: This is the same as 
the previous command but it affects the SSR chain 
associated with the macro opcode, status, and interrupt 
base address registers. 

Illegal Code: Due to the way the host interface
controller algorithm was implemented, this command (address
combination) is illegal. If it is used, it will lock up the host
interface controller in an infinite loop.

Access Timing 

The speed of interaction between the host and the 
Am29300 system is regulated by both the host and the 
host interface controller.

Once the Am29300 system is addressed by the host, the 
host interface controller holds the external bus by driving 
EXT_READY inactive. This continues until the host inter- 
face controller completes the command requested. The 
EXT_READY signal is then made active and held active 
until the host stops addressing the Am29300 system. At 
that time, the host interface controller recognizes that the 
host has completed the transaction and the 
EXT_READY line is again made inactive.

In this fashion, either the host interface controller or the
host can extend the length of the external bus transaction
as required. The signal timing between the host and the
host interface is treated as asynchronous. The timing of
the host interface itself is synchronous with the Am29300
internal clock cycle.
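
The EXT_READY handshake described above can be modeled as a three-state machine sampled once per Am29300 clock. This Python sketch is an assumption-laden illustration, not the Am29PL141 program (which is defined in Appendix E): the state names, the function name, and the fixed busy_cycles count are hypothetical.

```python
def ready_trace(selected, busy_cycles):
    """Return per-clock EXT_READY values (True = active).

    selected:    per-clock samples of the synchronized board-select signal.
    busy_cycles: clocks the controller needs to finish the command
                 (hypothetical; the real count depends on the command).
    """
    state, remaining, trace = "IDLE", 0, []
    for sel in selected:
        if state == "IDLE":
            if sel:                      # host has addressed the board
                state, remaining = "BUSY", busy_cycles
        elif state == "BUSY":
            remaining -= 1
            if remaining <= 0:           # command finished
                state = "DONE"
        elif state == "DONE":
            if not sel:                  # host released the bus
                state = "IDLE"
        trace.append(state == "DONE")    # READY is active only when done
    return trace


samples = [False, True, True, True, True, True, False, False]
example_trace = ready_trace(samples, 2)
```

READY stays inactive while the command runs, goes active when it completes, and is held active until the host stops selecting the board, so either side can stretch the transaction.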

An interaction diagram is shown below for a bus
transaction between the host and the Am29300 system. The
single-line dividers indicate one clock cycle of the
Am29300 system. The double-line dividers indicate one
or more clocks as needed for synchronization or
algorithm execution.

The length of an external bus transaction can vary from 
about 6 Am29300 system clock cycles for a memory 
access, to about 80 clock cycles for an SSR shift 
operation. Regardless of the transaction type, the 
Am29300 system looks to the host like a slave bus 
peripheral. Sometimes, as in the case of the SSR shift 
operation, it is a rather slow peripheral. 



External Bus Activity 



Am29300 System Activity 



Address to Am29300 is 
active on the bus. 



CPU is active. 

CPU owns MA and MD bus. 



Address is clocked into 
the host interface 
controller synchronizing 
register. 



CPU is still active. 

CPU still owns internal bus. 

Host interface controller 

performs branch to command 

routine. 



External bus 
transceivers are enabled 
if needed. 



CPU clocks are stopped. 
CPU bus buffers are disabled. 
Host interface executes first 
instruction of command routine. 
READY may or may not be made 
active depending on routine. 



If READY is inactive, 
wait for host interface 
to complete algorithm 
and make READY active. 
CPU operation is still 
suspended. 



If READY is active, then 
wait for host to 
release external bus by 
stopping selection of 
the Am29300 system. 



External bus address 
no longer selects 
Am29300 system. 



CPU still suspended. 
Host interface waiting to 
see host release bus. 



Lack of external bus 
address is clocked into
host interface sync 
register. 



CPU still suspended. 

Host interface branches back 

to idle loop. 



External bus transceiver 
is disabled. 



CPU clocks are active. 

CPU has MA and MD bus access. 

Host interface waits in idle loop for next command.

Program Definition

A detailed definition of the host interface controller's 
algorithm is contained in Appendix E. 

MEMORY 

Memory Components 

The memory device used to construct the 16K word x
36-bit memory is the Am99C165. This is a 16K x 4-bit CMOS
static RAM. The 35 ns access time version is
assumed in any timing estimates for the Am29300
demonstration system. Nine memories are used as
shown in Figure 4-6.

The Am99C165 is used so that an additional output 
enable is available to help prevent bus contention with 
other buffers on the MD_BUS. The memory outputs are 
disabled whenever the memory write enable line is 
active. The write enable line is also used to control the 
direction of the external bus data transceiver and the 
enable on the CPU data buffer. The delay of the inverter 
on the output enable input to the memory has been 
matched by a buffer in each of the other bus drivers just 
noted. This is so that when a write operation is signalled, 
each bus driver receives its bus enable or disable signal 
at the same time as the memory. This overlaps the
turn-off time of the memory outputs with the turn-on time of the
other bus drivers to minimize bus contention with the
memory.



The enable line to the memory is used to power down the 
memory when it is not being selected by the Am29300 
CPU. 

The write enable line to the memory is gated with the 
Am29300 system free-running clock. This keeps the 
write line high (inactive) until late in the cycle when all 
the control signals that feed into the memory enable 
have settled. This is important for cycles in which there 
is a change of ownership on the memory address and
data buses. The gating with clock ensures that unin- 
tended pulses on the write enable line that may occur 
early in the system cycle will not cause spurious writes in 
the memory. 

Addressing Scheme 

Description: With reference to Figure 4-1, the memory
address bus (MA_BUS) is not only the address input to
the memory, it is also a part of a 4 to 1 multiplexer. There
are four address drivers tied to the MA_BUS. They are:
the A_BUS to MA_BUS buffer, the External Bus address
to MA_BUS buffer, and the two memory address
counters. Each of these sources has three-state output drivers
and, by careful control of which source is allowed to drive
the MA_BUS at any one time, the sources form the 4 to
1 multiplexer.
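
The "one driver at a time" rule for a three-state bus can be expressed as a simple check. The Python sketch below is a behavioral illustration only: the source labels and the function name are hypothetical, and the real enables come from the control pipeline and the host interface controller.

```python
# Hypothetical labels for the four three-state sources on the MA_BUS.
SOURCES = ("A_BUS buffer", "External Bus buffer", "counter 1", "counter 2")


def ma_bus_value(enables, values):
    """Return the value on the MA_BUS for the given output enables.

    enables/values are dicts keyed by source label. Returns None when no
    source is enabled (bus floating) and raises on contention.
    """
    drivers = [s for s in SOURCES if enables.get(s)]
    if len(drivers) > 1:
        raise RuntimeError("bus contention: " + ", ".join(drivers))
    return values[drivers[0]] if drivers else None


values = {"A_BUS buffer": 0x0100, "External Bus buffer": 0x0200,
          "counter 1": 0x0300, "counter 2": 0x0400}
selected_value = ma_bus_value({"counter 1": True}, values)
```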

In this way the memory can be addressed directly by the 
A_BUS or the External Bus. The memory can also be 
addressed indirectly by the A_BUS via the memory 
address counters.

The memory address counters are loadable up/down
counters that can serve as address pipeline registers,
sequencers, or stack pointers independent of the CPU's
data section. They allow sequential reads or writes to
memory by the CPU without requiring the CPU to
calculate an address on every read or write cycle.

In fact, after loading a memory address counter with an
initial address, the CPU can perform sequential read
cycles while at the same time continuing to use the data
section for other calculations. This is possible because of
the dual write port design of the CPU register file. The
memory data is loaded into the register file via the B write
port while calculation results on the Y_BUS are stored
through the A write port.

Two counters are provided to allow for consecutive A and
B operand data fetches from two separate arrays of data
without the need to constantly reload the counter values.
Each counter is built from two AmPAL22V10
Programmable Array Logic (PAL) devices that act as two
cascaded 7-bit loadable up/down counters. The counters
are connected as shown in Figure 4-7. The logic
definition file for the PALs is given in Appendix F.
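
The cascading of two 7-bit stages into one 14-bit address counter can be sketched behaviorally. This Python model is an illustration under assumptions, not the PAL equations of Appendix F: the class names and the carry/borrow convention between stages are hypothetical.

```python
class Counter7:
    """Behavioral model of one 7-bit loadable up/down counter stage."""
    MASK = 0x7F

    def __init__(self):
        self.count = 0

    def load(self, value):
        self.count = value & self.MASK

    def step(self, up, carry_in):
        """Count once if carry_in is true; return carry (up) or borrow (down)."""
        if not carry_in:
            return False
        if up:
            carry = self.count == self.MASK
            self.count = (self.count + 1) & self.MASK
        else:
            carry = self.count == 0
            self.count = (self.count - 1) & self.MASK
        return carry


class AddressCounter:
    """Two cascaded 7-bit stages forming a 14-bit memory address counter."""

    def __init__(self):
        self.low, self.high = Counter7(), Counter7()

    def load(self, addr):
        self.low.load(addr)
        self.high.load(addr >> 7)

    def step(self, up=True):
        # The high stage counts only when the low stage carries or borrows.
        self.high.step(up, self.low.step(up, True))

    @property
    def value(self):
        return (self.high.count << 7) | self.low.count


counter = AddressCounter()
counter.load(0x007F)
counter.step(up=True)   # low stage wraps and carries into the high stage
```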

The two counters are only loaded from the A_BUS and 
not the External Bus, even though the connection of the 
counters to the MA_BUS would permit the latter. This is 
due to the difficulty in coordinating the use of the counters 



between the CPU and the External Bus. The counters are 
simply viewed as a resource of the CPU only. 

Why This Approach?: Why address the memory from 
the A_BUS? Doing so means that data in the memory is 
selected by an address previously stored in the register 
file. So one cycle must be used to calculate an address 
in the data section of the CPU, store the result in the 
register file, and take a second cycle to actually address 
the memory. Why not just take the address as it is 
calculated and feed it directly from the Y_BUS to the 
memory? 

First, the access time is better from the A_BUS than from
the Y_BUS. The A_BUS address is valid 45 ns into a
cycle which still leaves time to access a fast static RAM
in the same time that data would normally flow from the
A_BUS through the ALU and back to the register file. An
address on the Y_BUS would not be valid until 87 ns
into a cycle, which would require either that the memory
access extend the cycle length significantly or that the
address be pipelined into a memory address register and
be used to address the memory in a second cycle.

Second, since the register file can present two data
words in one cycle it is possible to address the memory
and provide write data in the same cycle; the address and
data go from the register file to the memory. If the Y_BUS
is used as the path to the memory in a write operation, a
second cycle must be used to provide the write data.

Third, the above comments are trick answers. If the two
approaches of A_BUS or Y_BUS as the memory address
path are carefully examined, it can be seen that it is really
a situation of "six of one, or half a dozen of the other".
Ultimately, in either case, a cycle is used to calculate the
address and a second cycle is used to read or write the
memory; there is only one data path in the system and
only one calculation can occur in a cycle. Between the
two approaches there are various ways to overlap other
calculations with memory accesses to make the best use
of the system's time, but either approach takes the same
time.

The real difference is that the A_BUS method is simpler 
from the microprogrammer's point of view. With the 
A_BUS method a memory read is done in one cycle and 
the resulting data is in the register file in the next cycle. 
With the Y_BUS approach there is a one cycle delay 
between a read access and the return of data, which 
requires that the microprogrammer "fill in the hole" in the 
microcode with other useful work to get the same system 
efficiency. So, as a designer's preference, the
A_BUS-for-memory-address approach is used.



CPU - Memory Buffers 

The address buffers from the A_BUS to the MA_BUS and 
the data buffers from the B_BUS to the MD_BUS are 
shown in Figure 4-8. The address and data buffers are 
built from Am29827 10-bit-wide high speed buffers. 

The address bus is 14 bits wide to address 16K words of
36-bit-wide memory. But these bits are taken from bit
positions 2:15 of the A_BUS. This leaves the two least
significant bits of the A_BUS unused and therefore treats
the address as being in terms of bytes with the
addressing restricted to four-byte (word) boundaries. This was
done so that interface with an external host bus would be
simpler. Many of the host systems with which this
demonstration system could be mated use byte addressing.
With the above address scheme, all the address line
numbering is consistent between the host and CPU. In
addition, if there were a future need to allow byte
addressing of the CPU memory, it would be possible with
only a minor change to the address buffer wiring. Also, it
may be noted that the parity bits on the A_BUS have been
ignored in the MA_BUS since there is no parity checking
implemented on the memory address.
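
The mapping from a byte-style address to a memory word can be shown in one line of arithmetic. The Python sketch below is illustrative; the function name is hypothetical and only the bit positions (2:15 used, 1:0 unused, 16K words) come from the text.

```python
def word_index(byte_address):
    """Return the 14-bit word index taken from address bits 15:2.

    Bits 1:0 are not wired to the memory, so the four byte addresses
    inside one 32-bit word all select the same word.
    """
    return (byte_address >> 2) & 0x3FFF


# Byte addresses 4 through 7 all fall in word 1.
indices = [word_index(a) for a in (0x0004, 0x0005, 0x0006, 0x0007)]
```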

The data buffers are arranged as one buffer per byte of 
the B_BUS (with parity on each byte). Note that, since the
B_BUS provides only write data, and read data from the 
memory is received by the register file, only a unidirec- 
tional buffer is needed. 

Whenever the external bus interface does not have the 
memory buses in use, the CPU to memory buffers 
receive the CPU_BUS_EN* signal to enable the buffers. 
If the operation is a write, the CPU_WEN* signal is 
provided by the CPU. 

Note that the CPU_WEN* is routed through the address 
buffer twice and then to the data buffer to enable it on a 
write operation. This is done to help equalize the timing 
between this buffer and the output enable on the mem- 
ory. Note also that the address buffers have a second 
enable input that is controlled by the control pipeline bits 
that manage whether the memory address comes from 
the A_BUS or from one of the memory address counters. 



External System Buffers 

The address buffers from the External Bus to the 
MA_BUS and the data buffers from the External Bus to 
the MD_BUS are shown in Figure 4-9. The address bus 
is built from Am29827 10-bit-wide high speed buffers. 
These buffers are connected in exactly the same way as 
described above for the CPU to memory address buffers.

The data buffers are, however, different from the earlier 
circuit description. These buffers are Am29863 non- 
inverting 9-bit high speed transceivers. The transceivers 
allow data to be both read and written by the external bus.

When the external host system addresses the Am29300 
CPU memory, the external bus interface controller halts 
the system clocks in the CPU and disconnects the CPU 
from the MA_BUS and MD_BUS by making 
CPU_BUS_EN* inactive. Then the external bus is con- 
nected to the memory by making EXT_BUS_EN* active 
to enable the external bus buffers. The external bus 
supplies a write enable if the operation will be a write. 
Note again that the write enable timing is equalized with 
that of the write enable to the memory.

SECTION 5



Control Section Description 



MACRO OPCODE SUPPORT 
Macro Opcode Register 

In order for the control section of the CPU to make use of
a macroinstruction, the instruction must be selected from
memory and loaded into a register that is accessible to
the control section.

This register is called the macro opcode register. It is a
32-bit register made from four Am29818-1 pipeline
diagnostic registers. This register is shown in Figure 5-1.

The most significant 14 bits (bits 31:18) of the register 
output are used as the macro opcode. Bits 31:22 are
connected to the address inputs of the macro opcode 



map RAM. Bits 21:18 are connected to one of the 
Am29331 sequencer's multi-way branch inputs. These 
lower four bits may thus be used as an opcode modifier 
via a multi-way branch. 

Bits 17:0 are the instruction operand register addresses.
These bits are divided into three 6-bit fields, one for each
register file port. Bits 17:12 are used as the register file 'A'
read port address. Bits 11:6 are used as the 'B' read port
address. Bits 5:0 are used as the register file 'A' write port
address. These addresses are respectively referred to
as the 'A', 'B', and 'C' operand register addresses.

These three addresses allow macroinstructions to specify
directly three-address operations with two read operands
and a separate write operand. Note, however, that these
bits are connected to the macro operand address counters,
which in turn are used to address the register file. This
is more fully described in a later section.

In addition, bits 23:18 are connected to the position 
multiplexer. This allows macro instructions to specify 
directly the ALU position input as the lower bits of the 
opcode. Taking the position information from these bits 
still leaves all of the operand register addresses free for 
use in three address operations. 

Also, bits 4:0 are connected to the width multiplexer. This
allows macro instructions to specify directly the width
input of the ALU for use in masked operations. Although
this overrides this field of the opcode for use as the 'C'
operand address, the 'C' operand address may internally
be specified as the same as either the 'A' or 'B'
operand register addresses. Thus two-address
macroinstructions involving width, or width and position
specifiers, are possible.
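
The field assignments described above can be summarized in a short Python sketch. It is an illustration only, not part of the hardware design: the names `MacroFields`, `field`, and `decode_macro` are hypothetical, and the position field is taken from bits 23:18 as stated in the text above.

```python
from typing import NamedTuple


class MacroFields(NamedTuple):
    opcode: int    # bits 31:22, address into the macro opcode map RAM
    modifier: int  # bits 21:18, multi-way branch input of the sequencer
    a_addr: int    # bits 17:12, register file 'A' read port address
    b_addr: int    # bits 11:6, register file 'B' read port address
    c_addr: int    # bits 5:0, register file 'A' write port address
    position: int  # bits 23:18, optional ALU position (overlaps opcode/modifier)
    width: int     # bits 4:0, optional ALU width (overlaps the 'C' address)


def field(word: int, hi: int, lo: int) -> int:
    """Extract bits hi..lo (inclusive) of a 32-bit word."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_macro(word: int) -> MacroFields:
    """Split a 32-bit macroinstruction into the fields the hardware taps."""
    word &= 0xFFFFFFFF
    return MacroFields(
        opcode=field(word, 31, 22),
        modifier=field(word, 21, 18),
        a_addr=field(word, 17, 12),
        b_addr=field(word, 11, 6),
        c_addr=field(word, 5, 0),
        position=field(word, 23, 18),
        width=field(word, 4, 0),
    )
```

Because the position and width fields overlap other fields, a given macroinstruction format uses each overlapping bit for only one purpose; the sketch simply reports every view of the word.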

Macro Opcode Format Restrictions 

Because of the large number of possible macroin- 
struction formats, this application note will not attempt to 
provide a detailed macroinstruction set definition. It is 
only important that the format restrictions imposed by the 
hardware design be stated. 

As defined by connections of the macro opcode register, 
the macro opcode must always be located within bits 



31:22. The size and position of the opcode within this field
are determined by how the macro opcode map RAM is 
set up to interpret and map the opcode. The optional 
opcode modifier (multi-way branch input) must be in bits 
21:18 if it is used.

The optional position field must be in bits 23:18 if used
and the optional width field must come from bits 4:0 
when used. 

All three of the operand register addresses are optional 
and if used must come from the fields specified in the last 
section. The operand positions are fixed for the 'A' and 'B'
operands since they may only come from the 'A' or 'B'
operand bits of the macro opcode register. The 'C'
operand address may come from any of the three 
operand fields. 

The reason that the 'A' and 'B' operands do not share the
positional flexibility of the 'C' operand is that the 'A' and
'B' operands specify registers to be read from the register
file. These read addresses are in the critical timing path
for the system, and any excess delay in selecting the
address adds directly to the system cycle time. A
multiplexer like that used for the 'C' operand address would
add undesired cycle length. The 'C' operand address
may afford its multiplexer delay since the 'C' operand
address is not used by the register file until late in the 
machine cycle. 

Each operand address is optional, because the operand
address may always be specified in the microcode. 

Any optional field, even an unused portion of the opcode 
field, may be used as a data operand. Where a field is not 
used as part of the instruction control, it may be treated 
as data by loading the macroinstruction into the register
file. Once the instruction is in the data section of the 
system, any data field may be extracted and used in 
calculations. 

Some example macroinstruction formats are shown in 
Figure 5-2. The instructions are shown in a 32-bit word 
layout (byte parity is ignored for the moment). 

Macro Opcode Decoding Method 

The opcode portion of the macroinstruction is the index
into the control store for the location of the first instruction
of a microcode subroutine. Translating the bit pattern of 
the opcode into the microcode store address may be 
done several ways. 

The opcode could be used directly to point to a table of 
first instructions at the base of the microcode store. In 
such a scheme all microcode routines longer than one 
word would require the first word of the routine to branch 
to the remaining part of the routine elsewhere in the 
microcode store. This would break up many routines into 
different parts of microcode store. It may also be ineffi- 
cient, depending on what other functions the branch field 
of the microcode word could have performed if the first 
word of the routine did not have to be a branch. 

The opcode could be used directly with zeros inserted at 
the least significant end to form an address that would 
point to microcode entry points separated by 2, 4, 8, 16, 
etc. words, depending on the number of zeros appended.
This would allow more routines to be located in contigu- 
ous words. Only routines longer than the entry point 
spacing would have to be split by branching to other parts 
of microcode store. The disadvantage is that where 
routines are shorter than the entry point spacing, there 
would be unused holes in the microcode store. When 
microprograms are expanded and the microcode store 
gets full (as memories always seem to do), the micropro- 
grams will be split more and more times to fit into the 
unused holes in the microcode store. This will make the 
micro program more difficult to design and debug as the 
microcode store fills up. 

A PAL may be programmed to decode the opcode into 
entry point addresses spaced to fit the microprograms. 
This allows the microcode words of the routines to be 



kept together in consecutive locations, making design 
and debugging of programs easier. But each time rou- 
tines are moved or expanded in size, a new program for 
the opcode mapping PAL must be defined. 

A RAM or PROM memory may be used as a look-up table
for entry points in the microcode store. This allows the 
greatest flexibility. Microcode routines may be located 
anywhere in control store, independent of the opcode 
value. The entry points may be spaced to fit each routine.
As routines are changed or moved, it is very easy to 
reload the look-up table with new entry points. 

The opcode mapping method chosen for this system is 
the RAM approach. 
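
The trade-off between the mapping methods can be made concrete with a small Python sketch. It is a sketch under stated assumptions rather than a description of the actual hardware: the function names and the routine lengths used to exercise them are hypothetical. It compares fixed entry point spacing, which leaves holes, with a look-up table that packs routines back to back.

```python
from typing import List, Sequence


def direct_entry(opcode: int) -> int:
    """Opcode used directly as the control store address."""
    return opcode


def spaced_entry(opcode: int, zeros: int) -> int:
    """Opcode with `zeros` zero bits appended: entry points 2**zeros apart."""
    return opcode << zeros


def wasted_words(lengths: Sequence[int], spacing: int) -> int:
    """Unused control store words when each routine owns a fixed-size slot.

    Routines longer than the slot must branch elsewhere and leave no hole here.
    """
    return sum(max(spacing - n, 0) for n in lengths)


def packed_entry_points(lengths: Sequence[int], base: int = 0) -> List[int]:
    """Look-up table contents that place the routines in consecutive words."""
    table = []
    address = base
    for n in lengths:
        table.append(address)
        address += n
    return table
```

For four routines of 3, 1, 6, and 2 words, a spacing of 4 leaves 6 unused words and still splits the 6-word routine, while the table approach needs exactly 12 consecutive words.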

Macro Opcode Map RAM 

The map RAM is shown in Figure 5-3. It is formed from 
three Am9150 1K x 4-bit separate I/O high speed RAMs.

Together, the three RAMs provide a 12-bit output which 
is used as the microinstruction decode address. The 
address is limited to 12 bits since the maximum size of 
control store provided for in this system is 4K words. 

This decode address is connected to the 'A' address 
input of the Am29331 sequencer. When this address is
selected by the sequencer, a branch is made to the first 
microinstruction of the selected routine. 
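
A minimal Python model of the map RAM follows, assuming that each of the three 4-bit devices holds one nibble of the 12-bit entry address. Which device holds which nibble is an assumption, and the class name `MapRam` is hypothetical.

```python
class MapRam:
    """Three 1K x 4 RAMs read in parallel to give one 12-bit entry address."""

    DEPTH = 1024   # 10 address bits: macro opcode register bits 31:22
    SLICES = 3     # three 4-bit-wide devices

    def __init__(self) -> None:
        self.slices = [[0] * self.DEPTH for _ in range(self.SLICES)]

    def write(self, opcode: int, entry_point: int) -> None:
        """Store the low 12 bits of entry_point, one nibble per device."""
        index = opcode & (self.DEPTH - 1)
        for i in range(self.SLICES):
            self.slices[i][index] = (entry_point >> (4 * i)) & 0xF

    def read(self, opcode: int) -> int:
        """Return the 12-bit decode address presented to the sequencer."""
        index = opcode & (self.DEPTH - 1)
        return sum(self.slices[i][index] << (4 * i) for i in range(self.SLICES))
```

Entry addresses wider than 12 bits are truncated by the model, which mirrors the 4K-word control store limit stated above.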

The address input to all the Am9150s comes from the
most significant bits of the Macro Opcode Register (bits
31:22). This address selects the entry point into microcode
control store from the map RAM when a macroinstruction
is decoded. The macro opcode register is also
used during diagnostics and WCS loading to address the 
map RAM. 

The Am9150 RAMs are always selected and output 
enabled since no other device shares the 'A' input of the 
sequencer. Also the Am9150 has no power-down mode,
so there would be no advantage to deselecting the 
memory. Note: if lower power in the system is required, 
an alternate memory to use in implementing the map 
RAM would be the Am2148. That memory does save
significant power when deselected and would increase map
RAM access time only slightly. 

When the Am9150 RAMs are loaded with data, they 
are written with data as though they were an extension 
of the microcode control store. The writable control 
store write enable line is connected to the Am9150's 
write enable input. 

WCS Port

Also shown in Figure 5-3 is the Writable Control Store 
(WCS) port. This port is formed from two Am29818-1 
pipeline diagnostics registers. The port was shown in 
block form in Figure 4-5. The port is used as part of the 
system serial diagnostics and writable control store load- 
ing scheme. 

The bidirectional "inputs" of the Am29818-1 are con- 
nected to the macro opcode map RAM data inputs. When 
placed in a special mode, the port "inputs" are driven as 
data outputs. This data is then used as input to the map 
RAM during a WCS write operation. The data comes 
from the Am29818-1's internal shadow register. 

The outputs of the WCS port are connected to the 
microcode control store address lines. The WCS port 
may thus be used as an alternate address source for the 
microcode control store. During a diagnostic read or 
write of the control store, the WCS port provides the 
needed address. 

Note that the data for the outputs of the WCS port comes 
from the Am29818-1's internal pipeline register. The 
pipeline register contents are independent of the shadow 



register contents. This allows an address for the microc- 
ode control store to be in the pipeline register at the same 
time data for the map RAM is in the shadow register. 
These separate registers allow the WCS and map RAM 
to be written in the same cycle as though they were one 
writable control store. 

Macro Operand Address Counters 

These are three identical loadable up/down binary
counters made from AmPAL22V10 PALs. They are shown in
Figure 5-4. The logic definition file for the PALs is 
shown in Appendix G. 

One counter is used for each operand register address. 
The counters are loaded from the data outputs of the 
macro opcode register. The outputs of the counters are 
tied to the address inputs of the read and write ports of the 
Am29334 register file. 

The counter load, count direction, output enable, and 
count enable functions are internally decoded from in- 
puts that come from the control pipeline register. These 
counters are intended for use in array processing algo- 
rithms, one example being a digital signal processing 
algorithm for a filter. 

The counters make it simple to perform the same
calculation on arrays of data stored in the register file. One
microinstruction or a short microinstruction routine can 
loop on an array calculation and at the end of each 
calculation cycle simply increment the operand address 
counters. In that way, new operands are fetched for each
calculation on the array without the need for the microc- 
ode instructions to directly specify operand addresses. 

Control pipeline bits determine whether the microcode 
operand address or the macro operand counter address
is used. The selection is independent for each operand
address. Thus, an example would be the 'A' operand
address coming from the microcode while the 'B'
operand and 'C' operand addresses come from the
counters.

An additional feature is that the 'C' operand counter
address may be directed to the Am29334 register file 'B'
write port address input. This allows the 'C' operand
address to come from microcode while the 'C' operand
counter address is used in writing data from system
memory into the register file via the second write port.
This means that CPU calculations may continue 
uninterrupted while new data is being loaded into the 



register file. Also, as long as data is coming from sequential
locations in memory and going to sequential locations
in the register file, the memory address counter and 'C'
operand counter may be incremented together, thus 
loading several memory words in sequence. This loading 
may be accomplished without repeated address calcula- 
tion by the CPU. 

Operand Counter Use Example 

To help illustrate the use of the operand address counters,
a typical Finite Impulse Response (FIR) digital signal
processing filter algorithm is described here. 

An FIR digital filter takes in a stream of amplitude 
samples from an analog waveform. Each sample is 
processed through a series of calculations to produce an 
output value. The resulting stream of output amplitude 
values produces a waveform that is the result of a filter 
operation on the input waveform. 

The calculations involved are a series of multiplies be- 
tween different coefficient values and several past input 
samples. The result of each multiply is accumulated to 
produce one output value. The number of coefficients 
and retained past samples determines how selective
the filter operation is. The values of the coefficients
determine the type of filter operation; e.g., bandpass vs.
lowpass. 

The algorithm for calculating one output value would be 
the following: 

Sum := 0; 

for n = 1 to number_of_coefficients do
    Sum := Sum + (Sample(x - n) * Coefficient(n));

Each time a new input sample is acquired, the new 
sample becomes Sample(x), and all past samples shift 
down in the sample array such that Sample(x - 1) := 
Sample(x) for all x. Note that the number of retained past
samples is equal to the number of coefficients. 
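
The algorithm can also be sketched in runnable form. The Python below is a software illustration only, not the microcode. It assumes the retained samples are held newest first in a fixed-length buffer, that the newest sample takes part in the sum, and that the history starts at zero; the function name `fir_filter` is hypothetical.

```python
from collections import deque


def fir_filter(samples, coefficients):
    """Return one output value per input sample."""
    n = len(coefficients)
    # Retained samples, newest first. maxlen makes the oldest sample
    # fall off the far end when a new one arrives (the "shift down").
    history = deque([0] * n, maxlen=n)
    outputs = []
    for new_sample in samples:
        history.appendleft(new_sample)
        total = 0
        for sample, coefficient in zip(history, coefficients):
            total += sample * coefficient   # one multiply-accumulate per pair
        outputs.append(total)
    return outputs
```

Feeding a single unit impulse through the filter returns the coefficients in order, which is a quick check that the sample shifting is correct.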

This algorithm may be implemented with two arrays of 
data and a temporary register. One array contains coef- 
ficients and the other contains past input samples. 

The coefficient and sample operands may be multiplied
in a single system cycle by either the Parallel Multiplier or
the Floating Point Processor. The Parallel Multiplier may 
also perform an accumulate in the same cycle. The 
Floating Point Processor requires a second cycle to do 
the accumulate function. So for each multiply and accu- 
mulate operation on a sample-coefficient pair, either one 
or two cycles are needed. 

Obviously the operand counters may be used to address 
the data arrays. As each coefficient-sample pair is
multiply-accumulated, the counters are incremented to point
to the next pair of operands. This allows the inner
multiply-accumulate loop to be only one or two
microinstructions long.

One feature of the operand counters adds to the effi- 
ciency of this algorithm. When an operand counter 
reaches either the maximum or minimum count value, 
the counter will reload the original count value from the 
macro opcode register on the next increment. This creates
a counter that may treat the register file as a circular
buffer. The length of the buffer is determined by the
distance from the original count value to either the base
or upper limit of the register file address. 

Note also that if one counter is always incremented while 
the other is decremented, two circular buffers may share 
the register file. One has a lower bound of zero and the 
other an upper bound of 63. With this scheme two equal 
size buffers could be up to 32 words each. 
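
The reload behavior can be modeled in a few lines of Python. This is a behavioral sketch, not the PAL logic of Appendix G. It assumes a 6-bit count (0 to 63) and assumes that the reload is triggered by the limit in the direction of counting; the class name `OperandCounter` is hypothetical.

```python
class OperandCounter:
    """6-bit up/down counter that reloads its start value at the limit."""

    MAX = 63  # highest register file address
    MIN = 0   # lowest register file address

    def __init__(self, start: int, up: bool = True) -> None:
        self.start = start & 0x3F   # value loaded from the macro opcode register
        self.value = self.start
        self.up = up

    def step(self) -> int:
        """Advance one count, wrapping back to the start value at the limit."""
        limit = self.MAX if self.up else self.MIN
        if self.value == limit:
            self.value = self.start          # reload: circular buffer wraps
        else:
            self.value += 1 if self.up else -1
        return self.value
```

An incrementing counter loaded with 32 cycles through addresses 32 to 63, and a decrementing counter loaded with 31 cycles through 31 down to 0, giving the two 32-word buffers described above.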

The circular buffer approach to the arrays works well with 
the FIR filter algorithm. At the end of each output value 



calculation, the counter addresses will point back to the 
first coefficient-sample pair, ready for the next input 
sample iteration. 

Note that if on the last multiply-accumulate cycle of an 
iteration the sample operand counter is not incremented,
and the 'C' operand counter is used to load a
new sample from memory into the oldest sample array 
location, the effect will be to shift all the samples down by 
one in the array while overlapping the new sample load 
with the last cycle of a sample iteration. 

One additional cycle at the end of each iteration may 
move the output value from the register file to the mem- 
ory. No memory address calculation cycle is needed 
since the memory address counter may be used to 
address the memory. 

With this scheme only one cycle of overhead between 
iterations is needed. Therefore, assuming clocked multi- 
ply operation of the PM to achieve single cycle multiply- 
accumulate execution, a 31 coefficient FIR could com- 
plete one output value iteration in 32 cycles. Assuming a 
100 ns cycle time (100 ns clocked multiply in the PM), 
that would allow over 312,000 samples per second or an
input bandwidth of over 156 kHz. A 9 coefficient filter 
would have a 500 kHz bandwidth. 
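
The arithmetic behind these figures can be checked with a short Python sketch. It assumes, as the figures above imply, that the usable input bandwidth is half the sample rate; the function name and parameter names are hypothetical.

```python
def fir_throughput(coefficients, cycle_ns=100.0, overhead_cycles=1,
                   cycles_per_mac=1):
    """Return (cycles per output, samples per second, bandwidth in Hz)."""
    cycles = coefficients * cycles_per_mac + overhead_cycles
    sample_rate = 1e9 / (cycles * cycle_ns)   # one output value per iteration
    bandwidth = sample_rate / 2               # assumed: half the sample rate
    return cycles, sample_rate, bandwidth
```

With 31 coefficients this gives 32 cycles, 312,500 samples per second, and 156.25 kHz; with 9 coefficients it gives 10 cycles, 1,000,000 samples per second, and 500 kHz. Setting `cycles_per_mac=2` models the Floating Point Processor case, where the accumulate takes a second cycle.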

This is an example of how a microprogrammed system 
may have its architecture tuned to a particular applica- 
tion for the best possible performance. Much of the 
performance comes from the microprogrammed 
system's ability to control and perform several parallel 
functions at one time. 

REGISTER FILE ADDRESS MULTIPLEXER 

The Register File Address Multiplexer, shown in the 
block diagram of Figure 1-2, is made up of four separate
multiplexers. One multiplexer is used for each register
file address port; two read ports and two write ports.

Read Ports A and B 

These multiplexers are shown in Figures 5-4 and 5-5. 
Each multiplexer is really a three-state bus that may be 
driven either from the control pipeline register via an 
Am29827 three-state buffer or from an operand counter 
output. A bit for each address from the control pipeline 
selects which source may drive each address bus. 

The Am29827 three-state buffers are needed in addition 
to the three-state outputs of the control pipeline because 
each operand address is 6 bits. This number does not fit 
well into the 4-bit boundaries of each slice of the microcode
control store. So to avoid wasting control store bits,
the external three-state buffer is used to gate the control
pipeline address onto the register file address bus rather 
than trying to use the control store's own three-state 
outputs. 

Write Port A 

This multiplexer is implemented by a pair of AmPAL18P8
PALs. It is shown in Figure 5-6. The logic definition file 
for the PAL is contained in Appendix H. 



It is this four input hex multiplexer that allows the 'C'
register file operand (i.e., register file 'A' write port)
address to come from four possible sources. The address
may be provided from the 'C' operand in the control
store, 'C' operand counter, 'A' operand final address, or
'B' operand final address. The 'A' and 'B' operand addresses
are referred to as final because the multiplexer
input is taken from the register address buses after the 
choice between control pipeline or operand counter has 
been made for the 'A' and 'B' operand addresses. The 
select bits for the multiplexer come from the control 
pipeline. 
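
The address selection for the read ports and for write port A can be sketched in Python as follows. The select encoding (which code picks which source) is an assumption, since the actual encoding is defined by the PAL logic in Appendix H; the function names are hypothetical.

```python
def final_read_address(use_counter: bool, pipeline_addr: int,
                       counter_addr: int) -> int:
    """'A' or 'B' read port: control pipeline or operand counter drives the bus."""
    return (counter_addr if use_counter else pipeline_addr) & 0x3F


def write_port_a_address(select: int, c_pipeline: int, c_counter: int,
                         a_final: int, b_final: int) -> int:
    """Four input hex (6-bit) multiplexer for the 'C' operand address.

    Assumed encoding: 0 = control store 'C', 1 = 'C' counter,
    2 = 'A' final address, 3 = 'B' final address.
    """
    sources = (c_pipeline, c_counter, a_final, b_final)
    return sources[select & 0x3] & 0x3F
```

Selecting source 2 or 3 is what makes two-address instructions possible: the result is written back to the same register that supplied the 'A' or 'B' operand.
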

Write Port B

This multiplexer is made from an AmPAL22V10. It 
operates as a two input hex multiplexer. It is shown in
Figure 5-7. The logic definition file for the PAL is given
in Appendix I. 

It selects either the control pipeline 'C' operand address
or the 'C' operand counter address as the source for the
register file 'B' write port address. The select bit comes 
from the control pipeline register. 

POSITION AND WIDTH MULTIPLEXERS 

The position and width multiplexers are implemented 
with AmPAL22V10A PALs. They are shown in Fig- 
ure 5-8. The logic definition file for the PALs is given in 
Appendix I. 

Each is a two input hex multiplexer, identical to the 
multiplexer used for the B Write Port Mux. They select 



from the Position and Width values that may be provided 
either from the control pipeline or the Macro Opcode 
Register. The select control comes from the control 
pipeline. 

'A' speed PALs are used here since these multiplexers 
are in the critical path to the ALU. They must use 7 ns 
less delay than the combined delay of the 'A' Read Port 
Mux and Register File access time. The required 7 ns 
advantage is consumed by the ALU's longer propagation 
delay from Position input to Y output vs. Data input to Y 
output. 



SEQUENCER 

The sequencer is a 16-bit-wide address generator that 
controls the execution sequence of microinstructions 
stored in the microcode control store. It may handle 
interrupts or traps at any microinstruction boundary. 
An interrupt or trap is treated like an unexpected pro- 
cedure call. 

Two independent branch inputs as well as four multi-way
branch address sources are provided. One of the branch 
address inputs is bidirectional and may be used to read 
or write information in the sequencer's internal 33-level 
deep stack. 

A 16-bit counter, test condition multiplexer, and breakpoint
address comparator are also provided. The breakpoint
comparator is used as a hardware aid to microcode
debugging. The connections to the sequencer are shown 
in Figure 5-9. 

The sequencer's 'A' branch address input is connected to 
the Macro Opcode map RAM output and is the path 
through which the macroinstruction specifies its entry 
point into microcode. 

The 'D' branch address input is tied to the D_BUS. 
Through this path, branch addresses or constants come 
from the control pipeline register and data may be ex- 
changed with the data section of the CPU. 



The 'M0' multi-way branch address input is connected to
the macro opcode register bits 21:18. These bits may be 
used as a modifier to the macro opcode via a multi-way 
branch based on these bits. 

The 'M1' multi-way branch address inputs come from the
Floating Point Processor (FPP) external status register. 
These bits are the overflow, underflow, invalid, and 
'extra' status flags from the FPP. The 'extra' status flag is 
the OR of the zero, NAN, and inexact status flags from the
FPP. A single multi-way branch on these inputs may be 
used to detect and handle quickly any of the catastrophic 
status conditions from the FPP. If the 'extra' flag is active, 
it indicates that a second multi-way branch may be used 
to determine which of the 'extra' status flags is active. 

The FPP zero, NAN, and inexact status flags are con- 
nected to the 'M2' multi-way branch input of the se- 
quencer. 
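
The two-level status check can be illustrated with a short Python sketch. The bit order within each multi-way branch input is an assumption made for illustration, and the function names are hypothetical; only the grouping of flags and the OR that forms the 'extra' flag come from the description above.

```python
def m1_inputs(overflow: int, underflow: int, invalid: int,
              zero: int, nan: int, inexact: int) -> int:
    """4-bit value for the 'M1' input; 'extra' is the OR of the minor flags."""
    extra = zero | nan | inexact
    return (overflow << 3) | (underflow << 2) | (invalid << 1) | extra


def m2_inputs(zero: int, nan: int, inexact: int) -> int:
    """3-bit value for the 'M2' input, used only when 'extra' is set."""
    return (zero << 2) | (nan << 1) | inexact
```

A first branch on the 'M1' value separates the catastrophic conditions in one cycle; a second branch on the 'M2' value is needed only when the low bit of the first value is set.
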

The 'M3' multi-way branch input is tied to the ALU
microprogram status outputs so that an alternate means
of checking ALU status is available. A multi-way branch 
based on these bits is able to check multiple condition 
flags in a single cycle. 

The Force Continue and Carry-In inputs of the sequencer 
are active in a trap operation to prevent state change in 
the sequencer and capture the address of the trapped 
instruction in the interrupt return address register. Carry-
in (CIN*) is driven high by a trap event signal from the trap
logic in Figure 5-11. The trap event signal is also ORed 
with a signal from the control pipeline (P_FC) so that 
either signal will cause Force Continue to go high. The 
interrupt request input comes from the Trap circuit shown
in Figure 5-11. 

The sequencer's HOLD input is driven by the inverted 
value of the WCS_WR* signal from the host interface 
controller shown in Figure 4-3. When this signal is 



active, the sequencer's output will be three-stated so 
the WCS Port may drive the microcode control store 
address lines without contending with the sequencer's 
output drivers. 

The Slave input is grounded since no use of the mode is 
made in this demonstration system. 

The test condition inputs of the sequencer come from 
three sources. Conditions 11 through 7 are the ALU status
bits for zero, overflow, sign, carry, and link. Conditions 6 
through 2 come from the Macro Status Register; these 
bits are the macro version of the same ALU status bits. 
Condition 1 comes from the FPP external status register 
bit for zero. Condition 0 is unused.

Control for the sequencer's interrupt enable, test condi- 
tion select, and instruction input comes from the control 
pipeline register. 

The sequencer's D_BUS output enable comes from the
control decode logic. 

The sequencer A_FULL signal is used as an interrupt 
signal to the system interrupt controller.

The Equal (breakpoint) signal is used as a trap event 
signal to the Trap Logic. 

Interrupt acknowledge goes to the interrupt controller 
and trap logic to enable the interrupt and trap vectors onto
the microcode control store address bus when an interrupt
is executed.

The 'Y' outputs of the sequencer drive the microcode 
control store address lines to select each microin- 
struction. 

D BUS TRANSCEIVER 

The transceiver between the A_BUS and the D_BUS is 
shown in Figure 5-10. 

The D_BUS has no parity bits included whereas the
A_BUS does contain parity. It is therefore necessary to 
provide parity generation for the data moved from the 
D_BUS to the A_BUS. 

The D_BUS is only 16 bits wide vs. the 32-bit-wide 
A_BUS. Thus it is also necessary to provide bus drivers 
and parity generators for the upper two bytes of the 
A_BUS, even though no variable data is passed to the 
A_BUS from the D_BUS through those bits. 

The transceiver and parity generator/checker function 
are combined in a single device type: the Am29853. Four 
of these are used in addition to an Am29862 inverting
transceiver. The inverting transceiver is used on the 
parity bits because the Am29853 uses odd parity while 
the Am29300 system uses even parity. 

As an added convenience for when numeric constants 
are passed from the D_BUS to the A_BUS, an AND gate 
is provided to drive the inputs of the upper two bytes of 
transceiver. If the AND gate is enabled by the control 
pipeline, the most significant bit of the D_BUS will be 
copied to all the upper bits on the A_BUS, thus perform- 
ing a sign extend for two's complement numbers. If the 
AND gate is disabled, the upper bits of the A_BUS are 
forced to zero. 
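
The data path from the D_BUS to the A_BUS can be sketched in Python. This is a behavioral illustration with hypothetical function names. It assumes one parity bit per byte of the A_BUS and takes even parity to mean that the data bits plus the parity bit contain an even number of ones.

```python
def d_bus_to_a_bus(d_bus: int, sign_extend: bool) -> int:
    """Widen a 16-bit D_BUS value to the 32-bit A_BUS."""
    d_bus &= 0xFFFF
    # AND gate: the D_BUS sign bit reaches the upper bytes only when enabled.
    fill = 0xFFFF if (sign_extend and (d_bus & 0x8000)) else 0x0000
    return (fill << 16) | d_bus


def even_parity_bit(byte: int) -> int:
    """Parity bit that makes the total count of ones (data + parity) even."""
    return bin(byte & 0xFF).count("1") & 1


def a_bus_parity(word: int) -> list:
    """Even parity bits for the four A_BUS bytes, most significant first."""
    return [even_parity_bit(word >> shift) for shift in (24, 16, 8, 0)]
```

An odd parity generator produces the complement of `even_parity_bit`, which is why an inverting transceiver on the parity lines converts between the two conventions.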



INTERRUPT CONTROL 

Interrupt and Trap Philosophy 

What Is a Trap? 

Traps are events that require the immediate attention of 
the CPU. The urgency of the event is so great that the 
CPU must not even complete the execution of the instruction
in progress in the cycle that the trap request
happens. The CPU must not change any machine state 
in that cycle; it must store the address of the instruction 
that was to have been executed and must branch to a 
routine that services the trap event. 

The implication here is that the trap will prevent some 
disastrous change in machine state from which no recovery
would be possible. Also implied is that the trap
servicing routine may repair whatever the problem is and
then return to complete the execution of the instruction 
where the trap occurred. 

One additional implication is that the trap event may be 
signaled early enough in the instruction cycle to prevent
the clocking (change of machine state) that normally 
occurs at the end of each instruction. 

An example of a trap event could be a miss on cache 
memory access. To complete an instruction when the 
data being accessed from a cache is invalid would be a 
disaster with little chance for recovery. If a trap routine to 
update the cache may be executed instead of completing
the instruction, the program may be saved. After the 
cache has the correct data, the trap routine may return to 
the aborted instruction to continue execution of the 
program as if no problem had existed. 

Another example of a trap would be a program break- 
point. When debugging a program it is very useful to be 
able to stop execution of a program just before executing 
a particular instruction. If this is done, the state of the 
machine before executing the breakpoint instruction may 
be examined. To do this the address of the breakpoint 
instruction is recognized as the instruction is fetched from
microcode control store. In the next cycle before the 
instruction may complete, a trap occurs which branches 
to a debugging routine. When the programmer is ready to 
continue the program, a return from trap completes the 
execution of the breakpoint instruction. The breakpoint 
trap operation is easy to do, and hardware to implement 
it is already provided in the Am29331 sequencer. The
breakpoint trap operation will be shown in the Trap Logic
described later. 

What is an Interrupt? 

Interrupts are events that require the attention of the CPU 
soon.

"Soon" is defined as faster than might happen if the event
were polled by a CPU program but later than a few 
microinstruction execution cycles. 

Interrupt events and the resolution of an interrupt are not
directly tied to the CPU state. No disasters occur if a few 
cycles pass by before the interrupt may be handled. 

Examples of events handled via interrupt could be: 
external mechanical events such as switches being 
opened or closed, an impending stack-full situation, a 
message signal from another processor, or a peripheral 
delay timer indicating time-out.

In this demonstration system one other class of interrupt
source is included. It is the parity error. A parity error 
implies corrupted data in a program that cannot be 
corrected. Since the influence of corrupted data on the 
program is difficult to determine or correct for, the af- 
fected program should be aborted. A parity error is, 
therefore, important to detect so that the program in 
which it occurs may be terminated and perhaps rerun 
with corrected data. 

Parity errors are treated as interrupts rather than traps for 
two reasons. The indication that an error has occurred 
comes fairly late in an instruction cycle and is therefore 
difficult to use as a trigger for a trap. When a parity error 
occurs, the program is generally corrupted and will be 
terminated; whether the termination happens in the cycle
following the error as would be the case with a trap, or
within a few cycles, as with an interrupt, is unimportant. 

Interrupt Operations 

There is no need to design an interrupt circuit from 
scratch when one already exists. The Am29114 interrupt
controller is used in this system. It provides interrupt
latching, priority, masking, and vector generation for 
eight interrupt inputs. 

Interrupt Controller 

Six interrupt sources are used in this Am29300 system; 
the two remaining interrupt source inputs are available 
for software generated interrupts. 



The interrupt and trap circuit block diagram is shown in 
Figure 5-11. 

The three highest priority interrupts are parity error sig- 
nals from the D_BUS, the Am29C323 Parallel Multiplier, 
and the Am29332 ALU. 

The next priority interrupt is a signal from the FPP 
external status PAL, which indicates that one of the
following status flags is active: Overflow, Underflow, or 
Invalid. 

The next priority interrupt is the A_FULL signal from the 
Am29331 sequencer. This interrupt indicates that the 
sequencer stack will be full if three additional stack 
pushes occur. 

The next interrupt is the external bus interrupt signal from
the host interface controller. This is a "tap on the shoulder"
from the host that requests the Am29300 CPU take
some previously agreed on action, such as reading a 
message from the host out of memory. 

The two least significant interrupts are unused by hard- 
ware and are available for use as software interrupts. 
These interrupts would be set by the CPU writing into the 
Am29114 interrupt register.

The interrupt mode is set for capturing asynchronous low
going pulses as interrupt signals. This is done because 
most of the interrupt signals are only guaranteed to be 
active for a single clock cycle. Therefore, the interrupts
must be latched and held by the interrupt controller until 
acknowledged by the CPU. 

The D_BUS is connected to the interrupt controller data
pins so that the internal interrupt, mask, and in-service
registers may be read and written. 

The interrupt controller is selected and given instructions
via outputs of the control pipeline register. 

Interrupt Sequence 

During a given clock, one of the interrupt inputs goes 
active. At the end of that cycle (active edge of clock), the 
interrupt signal is clocked into the interrupt register of the 
Am29114. 

During the second clock cycle, the interrupt is ANDed
with the interrupt mask register and, if the interrupt is
allowed, its priority is compared to any currently
in-service interrupt. If the new interrupt is of higher priority
than any in-service interrupt, the MINTR* (interrupt
request) will go active at the next active clock edge.

During the third clock cycle, the Am29114 interrupt
request is externally ORed with the interrupt request from
the trap logic. The combined interrupt request is then
loaded into a delay flip flop. The delay flip flop is needed
to synchronize the final interrupt request with the system
clock. The reason for this is that the interrupt request from
the Am29114 is stable too late (41 ns) in the third cycle
to be useful in selecting an interrupt address. The set-up
time for the microcode control store address could not be
met if the Am29114 interrupt request were used directly
with the Am29331 sequencer.

The external OR and delay functions are imple- 
mented in an AmPAL22V10A, whose logic is shown in 
Figure 5-12. 

During the fourth clock cycle, the INTR* (interrupt
request) input of the sequencer is driven by the delay flip
flop. The sequencer then returns INTA* (interrupt
acknowledge) if micro-interrupts are allowed. The INTA*
signal enables the interrupt vector onto the microcode
control store address lines.



The three least significant bits of the interrupt vector are
provided by the Am29114 interrupt priority encoder. Bit 3 of the
interrupt vector is provided by the trap logic. The bit is low
for an interrupt and high for a trap vector. The upper bits
(4:11) of the vector are provided by an external
Am29818-1 register. This register provides a variable
base address for a nine entry point table look-up (multi-way
branch), which is based on the four bits of interrupt
vector from the Am29114. The Am29818-1 register is
loaded via the D_BUS or through the diagnostics SSR
chain. The need for a nine entry point table is explained
in the section on trap operation.
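
The way the three vector fields combine into a control store address can be sketched in a few lines of Python. This is only an illustration of the bit layout described above, not part of the original design files; the function name is hypothetical, and using zero for the low three bits during a trap is an assumption made for the example.

```python
def interrupt_vector(base, is_trap, priority):
    """Compose a 12-bit control store address from the three vector fields.

    bits 11..4  base address held in the Am29818-1 register (8 bits)
    bit  3      0 for an interrupt, 1 for a trap (from the trap logic)
    bits 2..0   priority code from the Am29114 priority encoder
    """
    if not 0 <= base <= 0xFF:
        raise ValueError("base must fit in 8 bits")
    if not 0 <= priority <= 7:
        raise ValueError("priority must fit in 3 bits")
    trap_bit = 1 if is_trap else 0
    return (base << 4) | (trap_bit << 3) | priority


# Eight interrupt entry points plus one trap entry point share one base.
# The low bits of the trap entry are taken as 0 here (an assumption).
table = [interrupt_vector(0xAB, False, p) for p in range(8)]
table.append(interrupt_vector(0xAB, True, 0))
```

With a base of 0xAB, the eight interrupt entries occupy 0xAB0 through 0xAB7 and the trap entry follows at 0xAB8, which gives the nine entry points mentioned above.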

During the fifth clock cycle of the interrupt sequence, the
first instruction of the interrupt routine will execute. During
this cycle the interrupt return address will be pushed
onto the sequencer stack.
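
The masking and priority comparison performed in the second cycle of this sequence can be sketched as a small behavioral model. This is an illustration only: the list index used as the priority, the mask polarity (True meaning "allowed"), and the function names are all assumptions, and the Am29114's actual register encodings are not modeled.

```python
# Interrupt sources in the priority order given in the text, highest first.
# The index is an illustrative priority number, not the Am29114 encoding.
SOURCES = [
    "D_BUS parity error",
    "Am29C323 parity error",
    "Am29332 parity error",
    "FPP status (Overflow, Underflow, or Invalid)",
    "A_FULL from the Am29331 sequencer",
    "Host bus interrupt",
    "Software interrupt (first)",
    "Software interrupt (second)",
]


def highest_pending(pending, mask):
    """Return the index of the highest-priority latched and allowed source.

    pending[i] is True if source i has been latched.
    mask[i] is True if source i is allowed (assumed polarity).
    Returns None when nothing qualifies.
    """
    for index, (is_pending, is_allowed) in enumerate(zip(pending, mask)):
        if is_pending and is_allowed:
            return index
    return None


def request_interrupt(pending, mask, in_service):
    """True if a request should be raised to the sequencer.

    in_service is the index of the interrupt being serviced, or None.
    A new request is raised only if it outranks the one in service.
    """
    candidate = highest_pending(pending, mask)
    if candidate is None:
        return False
    return in_service is None or candidate < in_service
```

For example, a latched host bus interrupt (index 5) raises a request while a software interrupt is in service, but not while the A_FULL interrupt (index 4) is in service.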

In summary, from the time an interrupt signal becomes
active until the interrupt service routine begins execution,
four instructions in the main program will complete
execution.

Figure 5-12. U75 AmPAL22V10A Trap Logic PAL

Trap Operation

Trap Issues 

A trap requires extremely fast response to the trap event 
signal. 

The ideal situation is for the trap event signal to cause the
abortion of the instruction in execution at the time the
event signal appears.

This is extremely difficult in a high clock frequency
system. To succeed, the trap event signal must be stable
at least in time to prevent clocking of the data section of
the CPU, which would otherwise change the system
state (i.e., complete execution of the instruction). This
implies that the trap event signal is stable one clock
control circuit set-up time before the high-to-low edge of
the system clock. The high-to-low edge of clock is significant,
because once the clock signal falls, the writing of
any write enabled port on the Am29334 register file will
begin. In addition, the trap event signal must be stable in
time to cause the Am29331 sequencer force continue
(FC), interrupt request (INTR), and carry in (CIN*) signals
to go high soon enough to disable the sequencer
microprogram address in time to meet the set-up time
requirements of the microcode control store.

In a 100 ns cycle time system, such as the one being 
discussed here, the trap event signal must be valid no 
later than 25 ns into the cycle. For a trap event signal 
that is to be derived from the effects of the instruction in 
execution in that cycle, this requirement is very difficult 
to meet. 

Fortunately there are trap events that may be signalled
on the one or two cycles previous to the cycle in which the
trap must occur. Some examples would be: a cache miss
that may be detected from the cache address created in
a cycle prior to that in which the cache data is used in a
calculation; or a breakpoint in which the breakpoint target
instruction address is detected by the sequencer in the
cycle prior to the instruction being loaded into the control
pipeline for execution.

If an instruction is a known potential trap, it is possible
to execute the instruction so that no critical information is
destroyed by completing its execution. This may be done
by writing results back to a temporary register while
allowing no other significant system state changes, such
as updating the ALU Q register, or doing a return from
procedure call. The instruction may then be allowed to
execute and generate any trap event signals that might
result from the execution, without concern for irrevocably
destroying data because of some error condition.



In the above examples, the trap event signal may be 
loaded into a delay flip flop to synchronize the trap 
request with the beginning of the following cycle. This 
causes the trap operation to occur early in the cycle 
following the event and to complete successfully. 

The only trap condition implemented in this design is the 
breakpoint. 

Trap Logic 

By definition, the response time between trap event
signal and trap operation must be much faster than the
four or more cycles that an interrupt takes to begin
execution. This requires that the trap logic be different
from the Am29114 interrupt controller. The trap logic
design is implemented in an AmPAL22V10A. The logic is
shown in Figure 5-12. The definition file for the PAL is
shown in Appendix J.

The trap logic is in effect a simpler and faster interrupt
controller. This "trap controller" is cascaded with the
Am29114 interrupt controller so that the same address
vector approach used with the interrupt controller may be
extended to trap operations.

A trap is treated as a special form of interrupt with a higher
priority. When a trap occurs, the trap logic generates a
cascade out (CASOUT2) signal to the Am29114 to
prevent any interrupt operation from beginning in the
same cycle.

The trap logic also generates an INTR signal to the
Am29331 sequencer. The INTR signal in turn causes the
sequencer to three-state its microcode address outputs
and return an INTA signal to the trap logic. The INTA
signal enables a four bit vector from the trap logic and the
interrupt base address from the Am29818-1 registers as
shown in Figure 5-11.

The above steps essentially generate an interrupt and 
provide the interrupt vector. What makes a trap different 
is that the Trap Logic is also used to drive the Am29331 
sequencer Force Continue and Carry-In inputs. This 
causes the sequencer to ignore the instruction being 
trapped and to perform a continue instruction instead, 
which changes no state in the sequencer. The CIN* 
signal's being high causes the trapped instruction ad- 
dress to not be incremented. Therefore, the trapped 
instruction's address will be loaded into the sequencer 
interrupt return address register. In addition, the TRAP 
signal is used to prevent any state change in the system 
other than in the sequencer, effectively aborting the 
trapped instruction.

Following are some other features to note in the trap
logic.

Am29300 system RESET is used to generate the se- 
quencer Carry-In signal (SEQ_CIN*). This is done to 
force SEQ_CIN* high during reset so that the first microc- 
ode instruction executed after reset will be at address 
zero rather than one. 
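
The role of the carry-in, both when a trap saves its return address and at reset, can be illustrated with a deliberately simplified model of the sequencer's address incrementer. Only sequential flow is modeled, the function name is hypothetical, and the carry-in is represented as a boolean "increment enabled" flag rather than by the actual polarity of SEQ_CIN*.

```python
def next_sequential_address(current_address, increment_enabled):
    """Simplified view of the sequencer incrementer.

    With the carry-in enabled, the address advances by one.
    With the carry-in disabled (as the trap logic and RESET force it),
    the address passes through unchanged.
    """
    return current_address + (1 if increment_enabled else 0)


# Ordinary interrupt during sequential flow: the following address is saved.
interrupt_return = next_sequential_address(0x120, True)

# Trap: the increment is suppressed, so the trapped instruction's own
# address is saved and the instruction is repeated on return.
trap_return = next_sequential_address(0x120, False)

# Reset: the increment is suppressed, so execution starts at address zero.
reset_start = next_sequential_address(0x000, False)
```

In this model the interrupt return address is 0x121 while the trap return address stays at 0x120, matching the behavior described in the text.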

In order for a trap operation to take effect, the instruction 
that is to be trapped must have its microcode interrupt 
enable bit active. This bit is used as the interrupt enable 
to the sequencer. If it is not active, then the microcode 
control store address from the sequencer will not be 
three-stated, and the interrupt vector will not be substi- 
tuted. In addition, the TRAP signal will still occur, causing 
the trap target instruction not to execute correctly. Note
that the interrupt enable bit could be externally forced 
active by the trap operation via an OR gate. But the added 
delay could cause the interrupt acknowledge to be too 
late to allow the interrupt vector address to meet required 
set-up times. (Of course, it is possible to design the 
system so that every trap causes all the system clocks to 
be stopped for one cycle. That would allow enough time 
for all kinds of tricks to be played. This design, however, 
will not explore that approach.) 

MICROCODE CONTROL STORE AND 
CONTROL PIPELINE REGISTER 

Control Store Function 

The microcode control store is the high speed memory 
that contains the control bits comprising the instructions 
that the system may execute. 

This system uses what is called "horizontal" microcode. 
Each microinstruction contains many control bits that 
manage a variety of different functions in parallel. "In 
parallel" is the key phrase. All the control information 
needed to manage the entire Am29300 system during 
the execution of one microinstruction is contained in one 
word of microcode control store. 

The memory must be fast because its access time must 
be significantly shorter than the cycle time of the system. 
In general the access time must be less than half the 
cycle length. This is because of the time required by the 
sequencer to generate each new address to the control 
store, which takes up the remaining time in the cycle. 

Pipeline Register Function 

At the output of the microcode control store there is a 
register to hold the control information stable during the
execution of an instruction. With the control information
held in the pipeline register, the control section of the 
CPU is free to begin reading the next microinstruction 
from the control store. In this way, the control section is 
operating in parallel with the data section. The control 
section fetches the next instruction while the data 
section executes the current instruction. This parallel 
operation, where one section of the system works on one 
step of a problem while another section works on the 
next step, is called pipelining, hence the name for the 
pipeline register. 

Through parallel operation, pipelining nearly doubles the 
speed of the system over what might be the case if the 
control section and data section were directly tied to- 
gether in a serial fashion. 
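
The "nearly doubles" claim can be checked with a small calculation. The sketch below is an illustration under stated assumptions: the delay values are hypothetical, the function names are not from the original, and the pipeline register's own delay is lumped into a single overhead term.

```python
def serial_cycle_ns(fetch_ns, execute_ns):
    """Cycle time when control fetch and data execution happen one after the other."""
    return fetch_ns + execute_ns


def pipelined_cycle_ns(fetch_ns, execute_ns, register_overhead_ns=0.0):
    """Cycle time when fetch and execute overlap.

    The slower of the two sections sets the cycle, plus any overhead
    added by the pipeline register itself.
    """
    return max(fetch_ns, execute_ns) + register_overhead_ns


def speedup(fetch_ns, execute_ns, register_overhead_ns=0.0):
    """Ratio of serial cycle time to pipelined cycle time."""
    serial = serial_cycle_ns(fetch_ns, execute_ns)
    pipelined = pipelined_cycle_ns(fetch_ns, execute_ns, register_overhead_ns)
    return serial / pipelined
```

With balanced sections and no overhead the ratio is exactly 2; any register overhead or imbalance between the two sections brings it below 2, which is why the gain is "nearly" double.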

Control Store Implementation 

Because this method of pipelining the output of a mi- 
crocode store is so popular, there are special memories 
available that combine a high speed memory with a 
pipeline register at its output. These combined memory 
and pipeline devices may significantly reduce the 
system parts count. 

These memories are available as either RAM or 
PROM devices. RAM versions are used to make 
writable control stores. 

These memories also include Serial Shadow Registers 
(SSR) along with the pipeline register. This allows diag- 
nostic routines to read and control the pipeline register 
outputs. Where RAM versions are used, the SSR is used 
as a built in means to load the writable control store. 

This system is designed to use one of the following for
control store: Am9151-50, 1K x 4 RAM; Am27S65,
1K x 4 PROM; Am27S75, 2K x 4 PROM; or
Am27S85, 4K x 4 PROM. These devices all share a
similar pinout so that simple jumper connections allow
any of them to be placed in the same sockets.

The connections to the control store are shown in Figures
5-13 and 5-14.

A total of 23 memories are used to form the needed 92- 
bit-wide microcode words. 

Because this system is designed to use no more than a 
4K word deep control store, only the lower 12 bits of 
microcode address from the sequencer are connected. 
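
The device count and address widths quoted above follow directly from the memory organizations listed earlier. A short check, using only the figures given in the text (the variable and function names are illustrative):

```python
import math

WORD_BITS = 92      # width of one microcode word
DEVICE_WIDTH = 4    # every listed device is organized as N x 4

# Depth in words of each device option named in the text.
DEVICE_DEPTH = {
    "Am9151-50": 1024,
    "Am27S65": 1024,
    "Am27S75": 2048,
    "Am27S85": 4096,
}


def devices_needed(word_bits=WORD_BITS, device_width=DEVICE_WIDTH):
    """Number of memories placed side by side to form one microcode word."""
    return math.ceil(word_bits / device_width)


def address_bits(depth):
    """Number of address lines needed to reach every word of a device."""
    return (depth - 1).bit_length()


address_lines = {name: address_bits(depth) for name, depth in DEVICE_DEPTH.items()}
```

The deepest option, the 4K device, needs 12 address lines, which is why only the lower 12 bits of the sequencer address are connected.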

The memories in the control store which provide the
microcode branch field are connected differently from the
remaining memories. This is because the branch field
outputs are connected to the D_BUS and must be
three-stated when other devices drive the D_BUS. All the other
outputs of the control store are always output enabled.

Figure 5-13. Microcode Control Store

Figure 5-14. Microcode Control Store

Figure 5-13 shows how the bulk of the control store is 
connected. 

When the Am9151-50 or the Am27S65 is used, the 
jumper at location "B" is connected. This continuously 
enables the memory. 

When the Am27S75 is used, the jumpers at locations A
and D are connected. Also, the Am27S75 G/Gs* (pin 20)
is internally programmed as an asynchronous enable.
Those jumper connections will always enable the memory
and connect address bit 10 to it.

When the Am27S85 is used, the jumpers at locations A
and C are connected. The Am27S85 G/Gs/I/Is* (pin 19)
is programmed as a synchronous initialize function.
Those connections will always enable the memory and
provide address bits 10 and 11 to it.

Figure 5-14 shows the connection for the memories 
that support the branch field. 

When the Am9151-50 or the Am27S65 is used, the 
jumpers at location B and E are connected. This enables 
the memory when the control pipeline selects the control 
store to drive the D_BUS. 

When the Am27S75 is used, the jumpers at locations A, 
D and E are connected. Also, the Am27S75 G/Gs* (pin 
20) is internally programmed as an asynchronous en- 
able. Those jumper connections will enable the memory 
when the control pipeline selects the control store to drive 
the D_BUS. 

When the Am27S85 is used, the jumpers at locations A,
C, and F are connected. The Am27S85 G/Gs/I/Is* (pin
19) is programmed as an asynchronous enable function.
Those connections will enable the memory when the
control pipeline selects the control store to drive the
D_BUS. Also, these connections imply that when the
Am27S85 is used, the branch field of the initialize word
will not be valid.



CLOCK CONTROL 

In almost every complex digital system there is a need to 
control and qualify selectively the system clock. 

A register often needs a qualified clock that will clock (i.e.,
load) the register only when specified by some control
signal. Sometimes a register will internally qualify its own
clock by providing a load enable input. But most often,
registers have only data input and outputs, an output
enable, and an unqualified clock input. It is up to the
system designer to provide a means to restrict the clock
to the register so that it receives clock only on those
cycles when its load enable control signal is active.

Restricting a clock in this fashion is referred to as quali- 
fying a clock. The controlling signal that enables the 
qualified clock is called the qualifier. 

Most synchronous digital systems have a system clock 
with a single active edge. This means that the system 
state will only change on either the low-to-high or high-to- 
low edge of the clock. The opposite transition of the clock 
will have no state changing effect in the system. The 
opposite transition of the clock is referred to as the 
inactive edge of the clock. It should be noted, however, 
that, even though there is a single active edge for the 
clocking of registered states in the system, the level of the 
clock may have an effect on some multiplexers or latches 
in the system. The level of the clock may control the path 
selected by a multiplexer, whether a latch is flow-through 
or held, or the write enable of a memory. 

To qualify a clock, there must be a way to prevent the 
active edge from occurring. This implies that the clock is 
held either high or low when it is prevented from cycling. 
The choice of whether the clock will be stopped (held) at 
its high level or low level may depend on what, if any, 
effect the level of the clock has on system multiplexers, 
latches, or memories. For example, if the low level of the 
clock enables a memory write line, it may be preferred to 
stop the clock at the high level rather than the low level to 
prevent any change in state of the memory. 

Clock Qualification Circuit 

In the Am29300 system described here, the system clock 
will be stopped at the high level. This is because the low 
level of the clock may start the writing of data into the 
Am29334 register file. The active edge of the clock will be
the low-to-high transition. 

This method of qualifying clocks is referred to as 'OR' 
qualification. Usually with this method the free-running 
(unqualified) version of the system clock is 'ORed' with a 
low active enable signal. Thus, if the enable is active (low) 
the resulting qualified clock is allowed to track the free
running clock. If the enable is inactive (high) the qualified 
clock will be forced high, stopping the clock, until the 
enable again goes active. Because the free running clock
is always high during the first portion of each clock cycle,
the clock enable signal need not be stable until just before
the inactive edge of the free running clock.

In this Am29300 demonstration system the following are
the desired controls over the system clocks:

1. The ability to stop all clocks to the Am29300 CPU,
   both control and data sections. This will suspend
   operation of (halt) the system.

2. The ability further to qualify register loading
   (register clocks) with control pipeline signals.
   The controlled registers would be the Macro
   Status, Macro Opcode, and Interrupt Base
   Address register.

3. The ability to single step all the system clocks
   when the system clocks are in the halt mode. Note
   this implies only conditional single stepping on
   those register clocks that are further qualified by
   load enable controls.

4. The ability to single step the data section or the
   control section independently.

5. The ability to force the control pipeline or the
   Macro Status, Macro Opcode, and Interrupt
   Base Address registers to load. This capability
   is used to implement diagnostic control over
   these registers.



To implement this kind of control over the system clocks, 
a separately qualified version of the system free running 
clock must be created for each differently handled regis- 
ter. The general clock for the control section is different 
from that for the data section. Also, each qualified regis- 
ter clock is different. 

The block diagram for the clock qualification circuit is 
shown in Figure 5-15. The logic equation definition file 
for the PAL in this circuit is shown in Appendix K. 

The qualifiers for the system clocks come from either the
control pipeline, trap logic or the host interface controller.
The AmPAL22V10A Programmable Array Logic (PAL)
device is used to combine the various qualifiers into the
appropriate clock enables for each differently handled
set of registers. The output of the PAL is then logically
ORed with the system free running clock to form the
various qualified clocks in the system.
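
The effect of OR qualification on a clock waveform can be seen in a small sampled model. This is a sketch only: the three-samples-per-cycle waveform, the function names, and the sample values are illustrative assumptions, and the enable is shown in its logical active-low form as in the general description of OR qualification.

```python
def qualify(free_clock, enable_n):
    """OR each free-running clock sample with the active-low enable sample.

    While enable_n is 0 the qualified clock tracks the free-running clock;
    while enable_n is 1 the qualified clock is held high (stopped).
    """
    return [clock | enable for clock, enable in zip(free_clock, enable_n)]


def rising_edges(samples, initial=1):
    """Count low-to-high transitions, the active edge in this system."""
    count = 0
    previous = initial
    for sample in samples:
        if previous == 0 and sample == 1:
            count += 1
        previous = sample
    return count


# Three cycles, each high for two samples then low for one, plus a final
# high sample so the edge that ends the third cycle is visible.
free_clock = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]

# Enable active (0) in cycles one and three, inactive (1) in cycle two.
enable_n = [0, 0, 0, 1, 1, 1, 0, 0, 0, 0]

qualified = qualify(free_clock, enable_n)
```

The free-running clock shows three active edges while the qualified clock shows two: holding the clock high through the second cycle removes the falling edge, and with it the active edge that would have ended that cycle.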

In this system, the free running clock generator produces
an active low clock with the enables active high. By using
negative logic OR gates (NAND gates) the clock and
enable signals are logically ORed together to produce
active high qualified clocks. The negative logic OR gates
are external to the clock qualifier PALs.

Figure 5-15. Clock Qualification Block Diagram

The NAND gates also serve as high output current
buffers that allow the qualified clocks to drive many
registers in the system. These NAND buffers also cause
the clocks to have very high speed edges. This requires
that clock lines be handled more carefully than other
signal lines to help prevent noise, reflections, and ringing
on the clock lines. Preventing these problems helps to
ensure clean clock signals free from the glitches that may
cause missed clocking or double clocking of registers. It
is suggested that clock lines be routed serially, kept less
than 12 inches in length, and terminated to the printed
circuit board's characteristic impedance at the last point
of use on each clock line.

Note that all the system clock lines, even the free-running 
clock line, pass through a NAND gate. This is done to 
equalize the delay of all clocks so that clock skew in the 
system is minimized. 
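
That a NAND gate acts as an OR on the signal polarities used here (active low clock in, active high enable in, active high clock out) can be confirmed by checking all four input combinations. The function names below are illustrative.

```python
def nand(a, b):
    """Two-input NAND on 0/1 values."""
    return 0 if (a == 1 and b == 1) else 1


def qualified_clock(clock_n, enable):
    """Qualified clock as produced by the external NAND gate.

    clock_n is the active-low free-running clock from the generator.
    enable is the active-high clock enable from the qualifier PAL.
    """
    return nand(clock_n, enable)


def nand_matches_or():
    """Check NAND(clock_n, enable) == clock OR (NOT enable) for all inputs."""
    for clock_n in (0, 1):
        for enable in (0, 1):
            clock = 1 - clock_n        # active-high view of the clock
            enable_n = 1 - enable      # active-low view of the enable
            if qualified_clock(clock_n, enable) != (clock | enable_n):
                return False
    return True
```

With the enable inactive (0) the output is 1 regardless of the clock, so the qualified clock is stopped at its high level; with the enable active the output is simply the active-high clock.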



Clock Generator 

The unqualified (free running) source for all the clocks in 
the system comes from a clock generator implemented in 
an AmPAL16R6B. A diagram of the logic implemented in
this PAL is shown in Figure 5-16. The logic equation 
definition file for this PAL is shown in Appendix L. 



The only reason that a clock generator PAL is used in 
addition to a simple clock oscillator module is to provide 
the ability to vary dynamically the length of each system 
clock cycle. This ability allows the system to run at the 
maximum clock rate most of the time when the fastest 
data paths are in use and to run at a slower rate only when
slower system data paths are in use. By slowing the 
system cycle time dynamically only when a slow data 
path is used, the average system speed is much higher 
than would be the case if the system clock rate were fixed 
at the rate required by the slowest data path. 
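
The benefit of stretching only the slow cycles can be estimated with a weighted average. The sketch below is illustrative: the 10 percent share of slow instructions and the function name are assumptions chosen for the example, not measurements from this system.

```python
def average_cycle_ns(fast_ns, slow_ns, slow_fraction):
    """Average cycle time when only a fraction of cycles are stretched."""
    if not 0.0 <= slow_fraction <= 1.0:
        raise ValueError("slow_fraction must be between 0 and 1")
    return (1.0 - slow_fraction) * fast_ns + slow_fraction * slow_ns


# Hypothetical mix: 10% of instructions need a 200 ns cycle, the rest 100 ns.
dynamic_average = average_cycle_ns(100.0, 200.0, 0.10)

# A fixed clock would have to run every cycle at the slowest rate.
fixed_cycle = 200.0
```

Under these assumed numbers the average cycle is about 110 ns, compared with 200 ns for a clock fixed at the rate of the slowest data path.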

A simple way to do this would be to divide the normal 
system clock by two and on each cycle select whether 
the normal length or the double length clock cycle would 
be used. 

In this system, finer control over the length of each cycle 
is desired. Where the cycle need only be a little longer 
than usual, only a slightly longer cycle is used rather than
doubling the cycle length. 

This is done by dividing down a high speed clock, which
runs three times faster than the normal system clock. It is
then possible to extend a clock cycle in increments of the
high speed clock. A cycle then may be 1, 1 1/3, 1 2/3, or
2 times the normal cycle length.

Figure 5-16. U100 AmPAL16R6B Clock Generator

The Am29300 demonstration system's normal clock is
10 MHz, a 100 ns cycle. The high speed clock is then
30 MHz and is provided by a commercially available
clock oscillator module.

The control over the cycle length comes from the control
pipeline register and may thus be specified differently on 
each instruction. Two bits are provided to select one of 
the four cycle lengths. Each instruction may thus control 
its own cycle length based on the time required by the 
data paths that are used. 
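
The four selectable cycle lengths follow from counting periods of the 30 MHz clock. A brief sketch, using exact fractions so that the thirds come out cleanly (the function names are illustrative; the mapping of each code value to its number of added periods follows the description in this section):

```python
from fractions import Fraction

HIGH_SPEED_MHZ = 30     # frequency of the oscillator module
NORMAL_PERIODS = 3      # high speed periods in a normal system cycle


def cycle_length_ns(code):
    """Cycle length in ns for a two-bit cycle length code (0 to 3).

    Each step of the code adds one high speed clock period.
    """
    if code not in (0, 1, 2, 3):
        raise ValueError("cycle length code is two bits")
    periods = NORMAL_PERIODS + code
    return Fraction(periods * 1000, HIGH_SPEED_MHZ)


def relative_length(code):
    """Cycle length as a multiple of the normal cycle."""
    return cycle_length_ns(code) / cycle_length_ns(0)


lengths = [cycle_length_ns(code) for code in range(4)]
```

The codes 0 through 3 give cycles of 100 ns, 133 1/3 ns, 166 2/3 ns, and 200 ns, that is, 1, 1 1/3, 1 2/3, and 2 times the normal length.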

The waveform of the clock may be described in terms of 
the number of high speed clock periods during which it is
active and then inactive. 

Note that the output of the AmPAL16R6 is inverting. The
logic internal to the PAL creates an "active high" clock
with a low-to-high active edge. This waveform is inverted
by the final output of the PAL and is later inverted once
more in the clock qualifying circuit. The final system
clocks are thus active high. When describing any system
clock, it will be done in terms of an active high clock. The
clock generator waveform is shown in Figure 5-17,
where the outputs are shown active high, even though
the actual PAL output is inverted.

Each clock cycle has two or more active periods followed 
by one inactive period. 



The clock generator PAL output is from a D flip flop. When
the flip flop output is inactive (low), one term feeds back
the inverted output. This will force the flip flop high on the
next high speed clock. The output of this flip flop feeds a
shift chain of four other flip flops, which act as a simple
timer for the extended cycle lengths.

During the first active period of the clock output, the 
output of the first flip flop in the timing chain is still inactive. 
This first flip flop's output is inverted and fed back into the 
clock output flip flop to force the clock output to remain 
high for a second active period. 

During the second active period, the clock cycle length 
bits from the control pipeline become stable and deter- 
mine whether additional active periods will be inserted 
into the output clock.
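
How late the cycle length bits may become stable comes down to a short timing budget. The sketch below reproduces that arithmetic with the figures quoted in this section (a 33.3 ns high speed clock period, 15 ns PAL set-up time, 10 ns PAL clock-to-output time, and 5.5 ns NAND gate delay); the constant and function names are illustrative.

```python
HIGH_SPEED_PERIOD_NS = 33.3      # one period of the 30 MHz clock, rounded as in the text
PAL_SETUP_NS = 15.0              # set-up time of the clock generator PAL
PAL_CLOCK_TO_OUTPUT_NS = 10.0    # clock-to-output time of the clock generator PAL
NAND_DELAY_NS = 5.5              # propagation delay of the qualifying NAND gate


def control_bit_budget_ns():
    """Longest control pipeline clock-to-output time the generator tolerates."""
    # The first two active periods are forced by the logic.
    forced_active = 2 * HIGH_SPEED_PERIOD_NS
    # The bits must be set up before the clock edge that samples them.
    after_setup = forced_active - PAL_SETUP_NS
    # The pipeline register clock lags the high speed clock by this skew.
    skew = PAL_CLOCK_TO_OUTPUT_NS + NAND_DELAY_NS
    return after_setup - skew
```

The result is about 36.1 ns, so a control pipeline register whose clock-to-output time stays within 36 ns meets the requirement.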

Note that since the first two periods of active clock are
forced by the logic, the control bits need not be stable for
two high speed clock periods minus the PAL set-up time
(66.6 ns - 15 ns = 51.6 ns). This time margin is further
reduced by the skew between the high speed clock and
the qualified clock to the control pipeline, which is equal to
the clock-to-output time of the clock generator PAL plus
the propagation delay of the qualifying NAND gate
(51.6 ns - (10 ns + 5.5 ns) = 36.1 ns). Therefore, as long
as the control pipeline register clock-to-output time does
not exceed 36 ns, the clock generator will work as
described here.

Figure 5-17. Clock Generator Outputs (Inverted)

If the clock cycle length bits are zero, no additional
feedback terms are enabled and the clock output flip flop
will go low in the next high speed clock period.

If the clock cycle length bits equal 1, the output of the
second timing chain flip flop is fed back to the output flip 
flop to allow one additional active clock period. 

Similarly, when the clock cycle length bits are equal to 2 
or 3, an additional 2 or 3 active periods are inserted in the 
output clock waveform. 

When the clock output flip flop again goes inactive, its 
output will force all of the timing chain flip flops to be 
cleared, thus beginning a new Am29300 clock cycle. 
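
The resulting output waveform can be sketched behaviorally, one sample per high speed clock period. This models only the observable output (two forced active periods, the extra active periods selected by the cycle length bits, then one inactive period) and not the individual flip flops or feedback terms of the PAL; the function name is hypothetical.

```python
def clock_waveform(length_codes):
    """Return the active-high clock as a list of samples.

    Each sample covers one high speed clock period: 1 is active, 0 is
    inactive. length_codes holds the two-bit cycle length code (0 to 3)
    for each successive system cycle.
    """
    samples = []
    for code in length_codes:
        if code not in (0, 1, 2, 3):
            raise ValueError("cycle length code is two bits")
        active_periods = 2 + code          # two forced, plus the extension
        samples.extend([1] * active_periods)
        samples.append(0)                  # one inactive period ends the cycle
    return samples


# A normal cycle followed by a cycle extended by one period.
example = clock_waveform([0, 1])
```

A normal cycle occupies three samples and a cycle with code 3 occupies six, matching the range of one to two times the normal cycle length.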

MICROCODE WORD 

This section describes the structure and function of each 
field of bits in this system's microcode word. Included are 
some comments on how functions were determined and 
how they might vary in similar systems. 

Control Philosophy 

In a microprogrammed system, each word of the microcode
functions as the determinant of all system action
during one clock cycle of system operation. Each bit
directly affects some aspect of the machine. Each field of
bits may act independent of other fields to manage
parallel data paths and simultaneous operations. This
ability to manage parallel activities in each machine cycle
gives a microprogrammed system high speed and
flexibility. But the power of complete parallel control over
nearly all the functions in a system comes at a cost.

The cost is wide control memory words. Fifty- to
150-bit-wide control words are common in microprogrammed
systems. Three hundred-bit-wide control words have
been used in large mainframe computers for years.

With each machine instruction eating up 100 or more
bits of memory, it doesn't take long to consume significant
board space, power, and cost for high speed microcode
memory.

The resulting dilemma between the need for parallel 
control and the cost, size, and power that accompanies 
it, is the basis of many a system designer's headache. 

The usual approach used to strike a balance between the
opposing issues is to determine carefully which functions
must absolutely be able to occur in parallel, then to limit
the microcode word size to that absolute minimum.
Control over other less frequently used functions or over
alternate operations is then overlapped with the primary
control fields.

Overlapping of control fields means that during certain 
operations, the meaning of the bits in the overlapped 
control field changes. The hardware controlled by the 
primary meaning of an overlapped field must be dis- 
abled during the time that the alternative meaning is in 
effect. This of course means that the functions
controlled by the overlapped fields cannot occur in the
same machine cycle.

This results in winning a little and losing a little. More 
control and thus more functions may be managed with 
less control memory, but some operations then take 
multiple cycles to complete, due to the use of functions 
that may not be managed in one instruction. Also, the 
need to enable and disable control field meanings and 
the associated hardware, will add control bits and decoding
logic. The decode logic adds delay into the machine
cycles and will cause the system to run a little slower.

Additional savings in control word size may be made by 
encoding fields rather than having each bit directly drive 
a control signal. This again adds decoding logic and its 
associated delay. 
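
The saving from encoding a field, and the decoder it then requires, can be illustrated as follows. This is a generic sketch rather than a description of any particular field in this system; the function names are hypothetical.

```python
def encoded_width(n_choices):
    """Bits needed to encode a field that selects one of n_choices signals."""
    if n_choices < 1:
        raise ValueError("a field needs at least one choice")
    return max(1, (n_choices - 1).bit_length())


def decode(value, n_choices):
    """Expand an encoded field into one control line per choice.

    This is the extra decoding logic (and delay) that encoding adds.
    """
    if not 0 <= value < n_choices:
        raise ValueError("encoded value out of range")
    return [1 if index == value else 0 for index in range(n_choices)]


# Eight mutually exclusive control signals: 8 direct bits versus 3 encoded.
direct_bits = 8
encoded_bits = encoded_width(8)
```

Eight mutually exclusive signals need eight microcode bits when driven directly but only three when encoded, at the price of a decoder in the signal path.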

The job of deciding what control must be parallel and 
what must be overlapped is more art than science. No 
matter how the microcode word is defined, there will 
always be other interesting ways to rearrange and over- 
lap the control fields. Each way will cost something either 
in word width or control decoding, thus providing endless 
trade-offs. 

All these possible variations make it extremely important 
to have a thorough understanding of the algorithms to be 
handled by a particular machine. The better the under- 
standing, the better the chance to optimize the system 
architecture and control to solve the problem at hand. 

Microcode Word Field Descriptions 

Throughout the figures that detail the design of this
system, signals that travel from page to page have been
given meaningful names that imply the function of the
signal. This helps in understanding what is going on in
each figure. Many of these signals are the direct outputs
of the control store pipeline register. As it turns out, many
of the bits in the microcode carry multiple meanings
because the functions of several fields are overlapped to
save microcode word size.

The result is that more than one signal name may often be
associated with a particular bit of the control pipeline.
Physically, of course, all signal lines that ultimately con- 
nect to a particular pipeline bit are one piece of wire. The 
logical separation of lines, by using different names, only 
helps to understand the function of a given signal, when 
the hardware that uses the signal is enabled. The follow- 
ing three Figures show the physical and logical relation- 
ships between the microcode control store bits and the 
signal names (meanings) that are attached. 

Each Figure is split into pairs of columns preceded by
one column that indicates the individual bit numbers for
each signal. Each column pair contains a Field Name
column that describes the function of the bit and a Signal
Name column that gives the signal name used throughout
the Figures in this document for that meaning. The
leftmost column pair shows the primary meaning of the
control bits. Other column pairs to the right give alternate
(overlapped) meanings for the control bits along with the
signal name used with each meaning.

Unless a control bit is overlapped with an alternate 
meaning in one of the columns to the right, the function 
of the control bit is constant. 

Register File Controls 

Figure 5-18 shows the microcode word bits that affect 
the Am29334 register file. 

It was decided that a three address machine would be the
most appropriate way to obtain the best performance 
from the Am29300 family components. Because of the 
common three bus architecture these parts share, a 
three address register file fits nicely. Two addresses are 
used to read an A and B operand from the file while the 
third address specifies an independent write location. 
This allows writing back results without requiring the 
destruction of one of the read operands in a single cycle. 

An address multiplexer on the C operand register ad- 
dress does allow for two and one address operations by 
allowing either the A or B operand address to be used for 
the write operand address in addition to its use as a read 
operand. 

Also, to support macroinstruction execution, address
multiplexers are used on the read addresses so that 
macroprogram supplied register addresses may be di- 
rected to the register file. When macroprogram supplied 
addresses are in use, the meaning of the register ad- 
dress fields changes to control signals for the macro 
operand address counters. With this alternate meaning, 
the macro addresses may be incremented or decre- 
mented at the end of each cycle. 



Bits 91 and 84 select whether the microcode or the macro
opcode addresses are directed to the register file. If
either bit is high, the alternate definition for the related
address field takes effect, and the macro opcode address
is used.

Bits 76 and 77 are used to select one of four addresses 
to be supplied to the A write port of the register file. The 
selections are as follows: 

  Bit
77  76

 0   0   C operand microcode address used.
 0   1   A operand address, as specified by bit 91.
 1   0   B operand address, as specified by bit 84.
 1   1   C macro operand counter address used.

When any selection other than for the C operand microc- 
ode address is made, the field assumes the alternate 
meaning for control of the macro operand counter. 
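
The address selection described above can be modeled in a few lines of Python. This is only an illustrative sketch, not the demonstration system's logic definition: the helper names bit, field, and register_addresses, and the mac_a, mac_b, mac_c arguments standing in for the macro operand counters, are assumptions made for the example. Bit positions follow Figure 5-18.

```python
def bit(word, n):
    """Return pipeline bit n (0 = least significant) of a microcode word."""
    return (word >> n) & 1


def field(word, hi, lo):
    """Return pipeline bits hi:lo of a microcode word as an integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def register_addresses(word, mac_a, mac_b, mac_c):
    """Model the A, B and C (A write port) address multiplexers.

    mac_a, mac_b and mac_c are hypothetical stand-ins for the current
    values of the macro operand address counters.
    """
    a = mac_a if bit(word, 91) else field(word, 90, 85)
    b = mac_b if bit(word, 84) else field(word, 83, 78)
    c_select = field(word, 77, 76)
    # 00: microcode C address, 01: A address, 10: B address, 11: C counter
    c = (field(word, 75, 70), a, b, mac_c)[c_select]
    return a, b, c
```

With bits 77:76 = 01 the write address follows whichever A address is in use, which gives a two address operation without spending the C address field.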

In addition to the three addresses used by the data 
section of the CPU, a fourth address is provided for the 
B write port of the register file so that data may be moved 
into the file via the second port while other calculations go
on undisturbed. 

The address for this fourth port comes from a multiplexer 
that may select either the C operand microcode address 
or the C macro opcode address counter as the source. Bit
69 is the select input for this fourth address multiplexer.

Bit 68 enables the register file A read port onto the 
A_BUS. If this bit is inactive and if the FPP seed register 
output is also inactive, the D_BUS to A_BUS transceiver 
is enabled so that constants, masks, and variables may 
be passed from the D_BUS to A_BUS. 

Bits 67 and 66 are used as the write enable controls for 
the two write ports of the register file. 

Data Path Controls

The data path controls are shown in Figure 5-19. 

To provide a straightforward example of the usage of the 
PM and FPP, these devices have had their input and 
output buses paralleled with those of the ALU. In this 
arrangement it is not generally feasible to make use of
more than one module in a given cycle. This is because 
the data buses may carry useful information to only one 
device at a time (this assumes that passing the same 
data to more than one device is of limited use). Also, only 
one device may drive the Y_BUS at a time.

Figure 5-18. Am29300 Demonstration System Microinstruction Word Layout - Register File Controls

Control   Primary                  Primary      Alternate 1          Alternate 1
Pipeline  Field Name               Signal Name  Field Name           Signal Name
Bit #     Meaning                               Meaning

P91       Reg A Macro/Micro*       P_ARA_MAC
          (If P91 = 0 then primary; if P91 = 1 then alternate 1)
P90       Register A Address (5)   P_RA (5)
P89       Register A Address (4)   P_RA (4)
P88       Register A Address (3)   P_RA (3)
P87       Register A Address (2)   P_RA (2)
P86       Register A Address (1)   P_RA (1)     RA Count Direction   P_UP/DN_A
P85       Register A Address (0)   P_RA (0)     RA Count Enable      P_CNTA_EN
P84       Reg B Macro/Micro*       P_ARB_MAC
          (If P84 = 0 then primary; if P84 = 1 then alternate 1)
P83       Register B Address (5)   P_RB (5)
P82       Register B Address (4)   P_RB (4)
P81       Register B Address (3)   P_RB (3)
P80       Register B Address (2)   P_RB (2)
P79       Register B Address (1)   P_RB (1)     RB Count Direction   P_UP/DN_B
P78       Register B Address (0)   P_RB (0)     RB Count Enable      P_CNTB_EN
P77       Reg C Add Source (1)     P_C_SEL (1)
P76       Reg C Add Source (0)     P_C_SEL (0)
          (If P77:76 = 00 then primary; if P77:76 = 01, 10, 11 then alternate 1)
P75       Register C Address (5)   P_RC (5)
P74       Register C Address (4)   P_RC (4)
P73       Register C Address (3)   P_RC (3)
P72       Register C Address (2)   P_RC (2)
P71       Register C Address (1)   P_RC (1)     RC Count Direction   P_UP/DN_C
P70       Register C Address (0)   P_RC (0)     RC Count Enable      P_CNTC_EN
P69       B Write Port Select      P_AWB_MAC
P68       A Bus Output Enable*     P_OEA*
P67       A Port Write Enable*     P_WEA*
P66       B Port Write Enable*     P_WEB*

Figure 5-19. Am29300 Demonstration System Microinstruction Word Layout - Data Path Controls

Control   Primary                  Primary         Alternate 1            Alternate 1      Alternate 2  Alternate 2
Pipeline  Field Name               Signal Name     Field Name             Signal Name      Field Name   Signal Name
Bit #     Meaning                                  Meaning                                 Meaning

P65       Data Path Select (1)     P_DPS (1)
P64       Data Path Select (0)     P_DPS (0)
          (Primary: ALU when P65:64 = 00; alternate 1: FPP when P65:64 = 10, 11; alternate 2: PM when P65:64 = 01)
P63       ALU Instruction (8)      P_ALU_INST (8)  FPU Instruction (4)    P_FP_I (4)       TCX          P_TCX
P62       ALU Instruction (7)      P_ALU_INST (7)  FPU Instruction (3)    P_FP_I (3)       TCY          P_TCY
P61       ALU Instruction (6)      P_ALU_INST (6)  FPU Instruction (2)    P_FP_I (2)       ACC (1)      P_ACC (1)
P60       ALU Instruction (5)      P_ALU_INST (5)  FPU Instruction (1)    P_FP_I (1)       ACC (0)      P_ACC (0)
P59       ALU Instruction (4)      P_ALU_INST (4)  FPU Instruction (0)    P_FP_I (0)       RND          P_RND
P58       ALU Instruction (3)      P_ALU_INST (3)  ENR*                   P_ENR*           XSEL         P_XSEL
P57       ALU Instruction (2)      P_ALU_INST (2)  ENS*                   P_ENS*           YSEL         P_YSEL
P56       ALU Instruction (1)      P_ALU_INST (1)  ENF*                   P_ENF*           TSEL         P_TSEL
P55       ALU Instruction (0)      P_ALU_INST (0)  Feed Through (1)       P_FP_FT (1)      ENXA*        P_ENXA*
P54       Position Mac/Mic*        P_POS_MAC       Feed Through (0)       P_FP_FT (0)      ENXB*        P_ENXB*
P53       Position (5)             P_POSITION (5)  IEEE/DEC*              P_IEEE/DEC*      ENYA*        P_ENYA*
P52       Position (4)             P_POSITION (4)  Seed Output Enable     P_SEED_OE        ENYB*        P_ENYB*
P51       Position (3)             P_POSITION (3)  Projective/Affine      P_PROJ/AFF*      ENP*         P_ENP*
P50       Position (2)             P_POSITION (2)  Rounding Mode (1)      P_FP_RND (1)     ENT*         P_ENT*
P49       Position (1)             P_POSITION (1)  Rounding Mode (0)      P_FP_RND (0)     FA           P_FA
P48       Position (0)             P_POSITION (0)                                          FTX          P_FTX
P47       Width Mac/Mic*           P_WID_MAC                                               FTY          P_FTY
P46       Width (4)                P_Width (4)                                             FTP          P_FTP
P45       Width (3)                P_Width (3)                                             PSEL (1)     P_PSEL (1)
P44       Width (2)                P_Width (2)                                             PSEL (0)     P_PSEL (0)
P43       Width (1)                P_Width (1)
P42       Width (0)                P_Width (0)
P41       Macro/Micro* Status      P_MIC/MAC
P40       Register Status          P_REG_STAT
P39       Load Macro Status        P_LD_MAC_STAT
P38       Borrow Mode              P_BM
P37       Memory Add Select (3)    P_MEM (3)
P36       Memory Add Select (2)    P_MEM (2)
P35       Memory Add Select (1)    P_MEM (1)
P34       Memory Add Select (0)    P_MEM (0)
P33       Memory Write En*         P_MEM_WR*

Figure 5-20. Am29300 Demonstration System Microinstruction Word Layout - Control Section Controls

Control   Primary                  Primary           Alternate 1                Alternate 1
Pipeline  Field Name               Signal Name       Field Name                 Signal Name
Bit #     Meaning                                    Meaning

P32       Cycle Length (1)         P_CLK_LEN (1)
P31       Cycle Length (0)         P_CLK_LEN (0)
P30       Interrupt Enable         P_INT_EN
P29       Force Continue           P_FC*
          (If P29 = 1 then primary; if P29 = 0 then alternate 1)
P28       Seq Instruction (5)      P_SEQ_INST (5)    Interrupt Host             P_INT_HOST
P27       Seq Instruction (4)      P_SEQ_INST (4)    Sign Extend A_BUS          P_SIGN_EX
P26       Seq Instruction (3)      P_SEQ_INST (3)    Initialize                 P_INIT
P25       Seq Instruction (2)      P_SEQ_INST (2)    Load Interrupt Base Add    P_LD_INT_BASE
P24       Seq Instruction (1)      P_SEQ_INST (1)
P23       Seq Instruction (0)      P_SEQ_INST (0)
          (If P29 = 1 AND P28:27 != 11 then primary; if P29 = 0 OR P28:27 = 11 then alternate 1)
P22       Test Select (3)          P_TEST (3)        Am29114 Instruction (3)    P_INT_INST (3)
P21       Test Select (2)          P_TEST (2)        Am29114 Instruction (2)    P_INT_INST (2)
P20       Test Select (1)          P_TEST (1)        Am29114 Instruction (1)    P_INT_INST (1)
P19       Test Select (0)          P_TEST (0)        Am29114 Instruction (0)    P_INT_INST (0)
P18       Load Operand Counter     P_LD_CNT
P17       Load Macro Op Reg        P_LD_MAC_OP
P16       Branch Field Enable*     P_BRANCH_EN*
P15       Branch Address (15)      D_BUS (15)
P14       Branch Address (14)      D_BUS (14)
P13       Branch Address (13)      D_BUS (13)
P12       Branch Address (12)      D_BUS (12)
P11       Branch Address (11)      D_BUS (11)
P10       Branch Address (10)      D_BUS (10)
P9        Branch Address (9)       D_BUS (9)
P8        Branch Address (8)       D_BUS (8)
P7        Branch Address (7)       D_BUS (7)
P6        Branch Address (6)       D_BUS (6)
P5        Branch Address (5)       D_BUS (5)
P4        Branch Address (4)       D_BUS (4)
P3        Branch Address (3)       D_BUS (3)
P2        Branch Address (2)       D_BUS (2)
P1        Branch Address (1)       D_BUS (1)
P0        Branch Address (0)       D_BUS (0)

If separate control bits were provided for the FPP or PM,
they could perform multi-cycle operations such as Newton-Raphson
division in the FPP or greater than 32 by 32
bit multiplies in the PM, while remaining detached from
the input and output buses during most of the multi-cycle
operation. If this were done, the ALU could operate in
parallel during such operations. The cost of doing this
would be an additional 15 to 35 bits added to the microcode
word width. These bits would get full use only in
those situations where parallel calculations are possible.

For this design it was decided to use a smaller microcode 
word by overlapping control bits for each of the three 
functional units. 



Data Path Selection: Only one functional unit (data 
path) in the data section is chosen in any one cycle. Bits 
65 and 64 select one of four options: 



  Bit
65  64

 0   0   ALU enabled
 0   1   PM enabled
 1   0   FPP enabled
 1   1   Special function






In the special function option, the FPP is enabled for 
calculation and the control bits are assumed to be set 
correctly for use by the FPP, but the output enable of the 
FPP is inactive with the ALU output enable active. The
ALU is not enabled for calculation in the sense that its 
hold input is made active to prevent state change in the 
status or Q registers. 

This odd-looking combination is used to provide input 
operand parity checking for the FPP. The FPP does not 
have its own parity checking circuits, so with this arrange- 
ment the ALU parity checkers will be enabled by the 
active output enable on the ALU. The FPP is still allowed 
to function and may complete its operation and store the 
result in its internal registers, while in the same cycle the
input operand parity is checked by the ALU. The ALU 
state is left undisturbed by this operation. 

How useful is this scheme? It may save a cycle once in
a while, but mainly it illustrates the odd sort of opportuni- 
ties one may find to use up an otherwise wasted control 
code. 

ALU Path: When the data path select bits enable the 
ALU meaning for bits 63:38, bits 54 and 47 are used to 
select either the microcode or macroinstruction position 
and width fields. The macro supplied information is 
selected when these select bits are high. When the 
macro source is selected, the microcode position and 
width fields are unused. 

Bit 41 selects macro or micro status inputs for the ALU. 
Bit 40 selects whether the status output of the ALU is 
flow-through or registered. 

Bit 39 is used as a clock qualifier for the loading of the 
ALU external macro status register.

Bit 38 directly controls the Borrow mode of the ALU. 

FPP Path: When the data path selects enable the FPP, 
the control bits shown directly manage the operation of 
the FPP as described by the Am29325 data sheet. Bit 52 
is used to enable the output of the FPP external "division 
seed" registered PROM. 

PM Path: When the data path selects enable the PM, the 
listed control bits are used as defined in the Am29C323 
data sheet. 

Data Path Enabling: What does it mean to enable or
disable one of the functional units? The control bits that
are shared between each functional unit are either high
or low every cycle, and they are connected to the ALU
and multipliers all the time. There is no intervening logic
that turns all the control bits "off" when a particular path
is not selected. Each device sees a jumble of nonsense
on its control lines whenever the control field meaning is
intended for another device. Nonsense or not, each
device will do whatever the control bits specify.

Enabling a data path means making the output enable of 
the selected device active so that it drives the Y_BUS and 
is able to write calculation results back into the register 
file. In the case of the ALU, enabling also means that the 
ALU hold input will be made inactive so that state change 
of the ALU status and Q registers is allowed. Enabling
one path implies disabling the other paths.

For the PM and FPP, disabling means their output
enables are inactive. It also means that the PM product
register feed through pin is disabled by the control
decode logic. For the FPP it means that both of its register
feed through lines are disabled by control decode logic.
These register feed through controls are disabled because,
if they are allowed to be active, it is possible for the
PM and FPP multipliers to feed back on themselves and
begin to oscillate. This action would not damage the
devices, but it could add to power consumption and
system power plane noise. A simple prevention is just to
disable the feed-throughs when the data paths are not
selected. Note that the ALU has no internal feedback
paths and does not need any similar treatment.
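
The enabling rules above can be summarized as a small truth-table model. The following Python sketch is illustrative only; the names are assumptions, values are logical states (True = active) rather than pin polarities, and letting the FPP feed-through controls pass in the special function code is an assumption based on Figure 5-19 listing the FPP field meaning for both P65:64 = 10 and 11.

```python
ALU, PM, FPP, SPECIAL = 0b00, 0b01, 0b10, 0b11  # codes for bits 65:64


def data_path_enables(dps, pm_feed_through=False, fpp_feed_through=False):
    """Return the logical control states for one cycle.

    dps is the value of pipeline bits 65:64; the feed-through arguments
    are the requests coming from the shared microcode field.
    """
    fpp_calculates = dps in (FPP, SPECIAL)
    return {
        # The ALU drives the Y_BUS when selected and in the special function.
        "alu_output_enable": dps in (ALU, SPECIAL),
        # Hold freezes the ALU status and Q registers unless the ALU is selected.
        "alu_hold": dps != ALU,
        "pm_output_enable": dps == PM,
        "fpp_output_enable": dps == FPP,
        # Feed-through requests are gated off for unselected devices.
        "pm_feed_through": pm_feed_through and dps == PM,
        "fpp_feed_through": fpp_feed_through and fpp_calculates,
    }


def y_bus_drivers(dps):
    """List the devices whose output enable is active for this code."""
    enables = data_path_enables(dps)
    return [n for n in ("alu", "pm", "fpp") if enables[n + "_output_enable"]]
```

Every select code leaves exactly one device driving the Y_BUS, which is the property the decode logic has to guarantee.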

Memory Control: Bits 37:33 are available at all times to
control the Am29300 system memory. 

Bit 33 is the memory write enable control. 

Bits 35:34 select the source of the address for the 
memory. 



  Bit
35  34

 0   0   No memory address or operation is selected
 0   1   A_BUS data is used to address memory
 1   0   The A memory address counter is selected for address
 1   1   The B memory address counter is selected for address

Bits 37:36 select the following:

  Bit
37  36

 0   0   Load counter A
 0   1   Load counter B
 1   0   Selected counter is incremented
 1   1   Selected counter is decremented



The increment and decrement commands have effect
only when a counter is selected as the MA_BUS source.
The load commands have effect only when the A_BUS is
the selected source.
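
The interaction between the two memory control fields can be sketched as follows. This Python model is an illustration only, not the PAL equations; the function and dictionary key names are assumptions, and the write enable bit is returned raw because its polarity is not modeled here.

```python
SOURCES = ("none", "A_BUS", "counter A", "counter B")       # bits 35:34
COMMANDS = ("load counter A", "load counter B",
            "increment", "decrement")                       # bits 37:36


def field(word, hi, lo):
    """Return pipeline bits hi:lo of a microcode word as an integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_memory_control(word):
    """Decode pipeline bits 37:33 into a source and an effective command."""
    source = SOURCES[field(word, 35, 34)]
    command = COMMANDS[field(word, 37, 36)]
    if command.startswith("load"):
        # Loads take effect only when the A_BUS is the address source.
        effective = source == "A_BUS"
    else:
        # Increment and decrement need a counter as the address source.
        effective = source.startswith("counter")
    return {
        "source": source,
        "command": command if effective else None,
        "write_enable_bit": field(word, 33, 33),
    }
```

Note that the all-zero code (no source, "load counter A") decodes to no operation, so an idle memory field has no side effects on the counters.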

Control Section Controls 

Figure 5-20 shows the bit definitions for the control
section.

Pipeline bits 32:31 control the length of each machine 
cycle. 



  Bit
32  31

 0   0   Normal cycle length
 0   1   1.33 X Normal cycle length
 1   0   1.66 X Normal cycle length
 1   1   2 X Normal cycle length
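
Because the cycle length is chosen per microinstruction, the run time of a microcode sequence is the sum of the individual stretched cycles. A minimal Python sketch, assuming a hypothetical normal cycle length supplied by the caller (the source does not give a value here) and illustrative function names:

```python
CYCLE_MULTIPLIER = (1.0, 1.33, 1.66, 2.0)  # indexed by pipeline bits 32:31


def cycle_length(word, normal_ns):
    """Length of the cycle controlled by this microcode word."""
    code = (word >> 31) & 0b11
    return normal_ns * CYCLE_MULTIPLIER[code]


def sequence_time(words, normal_ns):
    """Total time for a straight-line sequence of microinstructions."""
    return sum(cycle_length(w, normal_ns) for w in words)
```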



Bit 30 enables sequencer interrupts on a cycle by cycle 
basis. 

Bit 29 is the Force Continue signal for the sequencer.
When this bit is active, the sequencer will execute a
continue instruction regardless of the state of the sequencer
instruction or test select lines. This effectively
enables the alternate meaning for the sequencer instruction
and test select fields.

Bits 28:19 are normally the sequencer instruction and 
test select inputs. When Force Continue is active, the 
sequencer instruction field meaning changes. 

When Force Continue is active, bits 28:25 are used to 
control four individual functions. Bit 28 will send an 
interrupt signal to the host system. Bit 27 will enable the 
sign extension of data going from the D_BUS to the 
A_BUS. Bit 26 will force the control pipeline register to 
load data from the control store initialize register at the 
next active system clock. Bit 25 will enable the loading of 
the interrupt base address register.



Bits 22:19 are used to control the sequencer test selec- 
tion. When an unconditional sequencer instruction is in 
effect or when the Force Continue bit is active, bits 22:19 
are used to control the interrupt controller instruction.
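
The overlapped meanings of bits 28:19 can be sketched as a decode function. This Python model is illustrative only and its names are assumptions. It follows the conditions printed in Figure 5-20: Force Continue (P_FC*) is treated as active when bit 29 is 0, and an unconditional sequencer instruction is recognized by P28:27 = 11.

```python
def bit(word, n):
    """Return pipeline bit n of a microcode word."""
    return (word >> n) & 1


def field(word, hi, lo):
    """Return pipeline bits hi:lo of a microcode word as an integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_control_section(word):
    """Decode pipeline bits 29:19 into their currently active meaning."""
    force_continue = bit(word, 29) == 0  # P_FC* is active low in Figure 5-20
    out = {"force_continue": force_continue}
    if force_continue:
        # Alternate meaning of the sequencer instruction field.
        out["interrupt_host"] = bit(word, 28)
        out["sign_extend"] = bit(word, 27)
        out["initialize"] = bit(word, 26)
        out["load_interrupt_base"] = bit(word, 25)
    else:
        out["seq_instruction"] = field(word, 28, 23)
    if force_continue or field(word, 28, 27) == 0b11:
        # The test select is not needed, so the field goes to the Am29114.
        out["interrupt_instruction"] = field(word, 22, 19)
    else:
        out["test_select"] = field(word, 22, 19)
    return out
```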

Bit 18 is used to load the macro operand counters from 
the macro opcode register. 

Bit 17 is used to load the macro opcode register. 

Bit 16 enables the three-state outputs on the branch field
bits of the control pipeline register. If these outputs are
disabled, then the sequencer, A_BUS to D_BUS transceiver,
or interrupt controller may drive the D_BUS. How
a device is chosen to drive the D_BUS is explained in the
control decode logic description. It is only important to
note that if bit 16 is active, the branch field outputs will be
active and will have priority over any other driver on the
D_BUS.

Bits 15:0 are the branch address field to the sequencer. 
This field is also used to contain constants or masks. 
These may be used by the data section, sequencer, 
interrupt base register, or interrupt controller. It is a full 16
bits long in order to allow for constants or masks that fill 
half of the 32-bit data path. This allows 32-bit microcode 
supplied masks to be formed with two microinstructions. 
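
The arithmetic behind the two-instruction mask is simple. The Python sketch below only shows how two 16-bit constants combine into one 32-bit value; how the two halves are actually merged in the data section (for example by an ALU field operation) is not specified here, and the function name is an assumption.

```python
def mask32_from_halves(upper16, lower16):
    """Combine two 16-bit microcode constants into one 32-bit mask."""
    for half in (upper16, lower16):
        if not 0 <= half <= 0xFFFF:
            raise ValueError("each constant must fit the 16-bit branch field")
    return (upper16 << 16) | lower16
```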

Alternate Arrangements 

The microcode word size just defined for this system 
totals 92 bits wide. Having so many bits allows the 
flexibility to change the control over most of the 
machine's functions on any or every cycle. But, this 
degree of control flexibility is not required for every 
application. The size of the control store may be reduced 
based on how the system is used most often. Following
are a few comments on ways to rearrange and reduce the
control store size.

Current Control Bit Usage 

First let's look at how the control bits are used in this 
design. 

Seven of the bits are used to control the selection of 
alternate field meanings (i.e., overlap control in bits 91, 
84, 77:76, 65:64, and 29). 

Eleven bits are used to control functions that are desired 
to operate in all cycles, independent of other system 
operations. These are the register file write and read 
enables (bits 69:66), memory controls (bits 37:33), and 
the cycle length controls (bits 32:31).

Eight bits generally do not change state frequently. Their
existence in this design is a convenience that reduces the 
need for control decode logic and adds system flexibility. 
These bits are 41:38, 30, 18:16. 

Three bit fields are used only with some instruction 
types. These are the position, width, and branch fields. 
Whenever a particular instruction does not use a field, 
those bits in the field are currently wasted in that in- 
struction cycle. 
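
The bit counts quoted in this subsection can be checked directly from the bit ranges. The Python tally below is a sketch; the list names are assumptions, while the ranges are the ones given in the text and in Figures 5-19 and 5-20.

```python
def count_bits(ranges):
    """Total number of bits covered by a list of (high, low) ranges."""
    return sum(hi - lo + 1 for hi, lo in ranges)


OVERLAP_SELECT = [(91, 91), (84, 84), (77, 76), (65, 64), (29, 29)]
EVERY_CYCLE = [(69, 66), (37, 33), (32, 31)]
INFREQUENT = [(41, 38), (30, 30), (18, 16)]
SOMETIMES_UNUSED = {"position": (53, 48), "width": (46, 42), "branch": (15, 0)}

unused_field_bits = {
    name: count_bits([rng]) for name, rng in SOMETIMES_UNUSED.items()
}
```

An instruction that uses none of the three optional fields leaves 6 + 5 + 16 = 27 of the 92 bits idle in that cycle.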

Alternative Usage 

The bits that change infrequently could be replaced by 
decode logic that provides these same control signals via 
set-reset flip flops. The flip flops would be controlled by 
overlapping set and reset commands with some other 
control store field. This would add to the decode logic 
complexity and would limit when the flip flops could be 
changed by restricting the control over them to certain 
instruction types. Since they change only infrequently, 
the requirement to use certain instruction types when 
setting or resetting them should not be a problem. 

Those bit fields that are limited to certain instruction types
could be overlapped. An example might be to overlap the
position and width fields with the branch address field. 
This would restrict branches to instructions that do not 
require the position or width information. 

When alternative field meanings are enabled, often the 
alternative definition does not make use of all the bits in 
the field. This presents the opportunity to overlap other 
control bits that may be valid in the same cycle as the 
alternate meaning of the field. 

For example, some of the infrequently-used control bits 
could be overlapped with the unused bits of the register 
C address when the primary meaning of the C address 
field is not active. When a two address instruction is 
executed, the address for the C register comes from the 
A or B address, thus leaving the microcode field for the C 
register address available for other functions. 

In another example, the bits in the position and width 
fields that are not used by the PM or FPP could be 
overlapped with other control functions when the alter- 
nate meanings for the field are in effect. An alternate 
branch address field might be placed in those bits to allow
branch instructions in combination with FPP or PM
operations without the need for the currently defined 
branch field. 

Careful analysis of how each data path is used may also
allow reductions through the elimination of controls that
are not needed. As an example: if the PM were used
only in flow through mode, all the controls for register
enables, flow through modes, and input multiplexers
could be removed from the microcode word and those
inputs to the PM tied to fixed voltage levels. If only two's
complement mode is used then an additional two bits
may be eliminated. This would leave only four necessary
control bits: the accumulator controls, rounding mode,
and format adjust. This reduction might allow PM
operations to be overlapped with some multiply-accumulate
operations in the FPP.

By combining these reduction techniques, the following 
changes could be made: 

All of the eight infrequently used control bits could be 
moved to overlap with the C register address, with half in 
effect when the A address is substituted for the C address
and half in effect when the B address is substituted. 

The PM controls, except for flow through and two's
complement mode, may be moved to overlap with the
position, width, and memory control fields. Also, the
fourth data path select field may be changed to disable
the memory controls and select the ALU (minus the
position and width fields) to be active along with the PM.
In this mode the PM flow through and two's complement
mode controls would be fixed with no flow through and
two's complement mode active. The ALU position and
width inputs would be set to 0 and 31 respectively by
control decode logic (unless these fields were selected to
come from the macro opcode).

The branch address field may be moved to overlap with 
the position, width, and memory control fields. Whenever
the sequencer instruction selects a branch operation, the 
position, width, and memory fields are disabled and the 
branch address meaning substituted. 

If all of these changes are made, the currently defined 
branch address field and infrequently used control bits 
may be eliminated, which would save 24 bits of microc- 
ode word width. This would reduce the word size to 68 
bits. 
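
The 24-bit saving is the 16-bit branch address field plus the eight infrequently used control bits. A short Python check of that arithmetic follows; the variable names are illustrative.

```python
WORD_BITS = 92            # current microcode word, pipeline bits 91:0
BRANCH_FIELD_BITS = 16    # bits 15:0, moved onto position/width/memory
INFREQUENT_BITS = 8       # bits 41:38, 30, 18:16, moved onto the C address

saved_bits = BRANCH_FIELD_BITS + INFREQUENT_BITS
reduced_word_bits = WORD_BITS - saved_bits
saving_fraction = saved_bits / WORD_BITS  # share of control store width saved
```

The reduction is a little over a quarter of the control store width.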

This savings would come at the cost of allowing branch 
instructions only when the ALU instruction does not need
position or width information from the microcode (this 
information may still come from the macro opcode regis- 
ter) and when the system memory is not being used. 
Further, a PM operation could not occur with a memory 
access in the same cycle. Also, with these changes it 
would be possible to control the ALU and PM concur- 
rently when the ALU does not need position or width 
information and when the PM operates on internally 
registered data.

There are many such combinations of microcode control
field definition. Each one provides a different trade-off 
between word size and what operations may be concur- 
rent. Each one requires a different degree of complexity 
in the control decode logic. 

CONTROL DECODE 

What Is It Good For? 

The ideal microprogrammed system has a separate 
microcode control store bit for each control input that 
exists in the system. This kind of complete control over 
every aspect of the system directly from the control 
pipeline totally eliminates the need for decoding the 
meaning of any system control bits. It also requires a very 
large microcode word to manage most useful systems. 
So in the real world, most microprogrammed systems 
encode or overlap at least some control functions in the 
microcode word. 

Encoded control or not, each control input in the system 
requires valid voltage levels during each machine cycle 
if the system is to operate as expected. 

The control decode logic acts as the bridge between 
encoded or overlapped (i.e., sometimes unavailable) 
microcode control fields and the related control signals in 
the system. The control decode logic continuously pro- 
vides valid logic levels for those control signals that 
cannot be directly driven by the control pipeline register. 

If the control field for a particular function is encoded, the
control logic translates the function codes into individual
control signals. Where control fields are overlapped, the
control logic may be used to disable logic affected by a
control field when that field has a meaning different than
that intended for the logic being disabled (i.e., when
overlapped control is active).

In some cases, control logic is used to prevent harmful 
conflicts between the meaning of different control bits, for 
example when two separate control fields affect the 
three-state enables on different buffers which may drive 
the same signal line. Certain combinations of control bits 
might enable both buffers in the same cycle causing 
contention between the buffers. Allowed to continue for 
long periods, this kind of contention may destroy the 
buffers. Control logic may be used in this situation to 
disable one or both buffers when the combination of 
controls affecting them would otherwise cause damage.
In fact it is strongly recommended that this kind of 
problem always be avoided by designing the control 
decode logic to prevent such disasters. The alternative is 
to watch hardware melt because of a software mistake. 



Control Logic Description 

Some of the control logic function in this demonstration 
system has been distributed into the devices being 
controlled. This is done when a PAL is used to implement 
a function. A PAL generally has excess inputs and 
internal logic that may be put to use in decoding the 
meaning of encoded control fields (e.g., the memory
address counters). The memory address counters are
implemented from AmPAL22V10 devices and are shown
in Figure 4-7. The control for loading, incrementing, 
decrementing, and output enabling the counters is pro- 
vided directly from the encoded memory control field. 
The PALs internally decode the meaning of the control 
bits. 

When a device requires a decoded control signal, the 
signal must come from control decode logic that takes 
control pipeline bits as input and produces the needed 
control signal. In this system, the required control logic 
has been implemented in three AmPAL18P8B PALs. 
These PALs are fast to minimize the delay induced 
between the control pipeline register and the device 
controlled. The PALs also provide the convenience of 
having programmable output levels, either high or low 
active for each output, independent of other outputs. 

The block diagram for these PALs is shown in Figures 5- 
21 and 5-22. The logic definition files for these PALs are 
in Appendix M. 

The ALU output enable, ALU hold, and PM output enable 
are all direct results of the pipeline data path select bits. 

The pipeline controls for seed register output enable, PM 
flow through, and FPP flow through are gated by the 
appropriate data path selection so that each control 
signal is active only when the related data path is se- 
lected. 

The D_BUS to A_BUS direction of the D_BUS trans- 
ceiver is enabled by the register file A output's being 
disabled in conjunction with the seed register output's 
being disabled. 

The A_BUS to MA_BUS buffer is enabled by certain
codes of the memory control field. 

The control store initialize register select is enabled by 
the combination of the pipeline Force Continue and the 
pipeline control bit for the initialize select. It is also 
enabled by the WCS_INIT* signal from the host interface
controller. Note that the initialize control is synchronous 
as used in this system so that the initialize word is loaded 
only at the next active clock.

Figure 5-21. Control Decode Logic Part 1

Figure 5-22. Control Decode Logic Part 2






The D_BUS sign extend, Sequencer output enable, 
Interrupt controller instruction and chip select enables, 
and A_BUS to D_BUS enable are all direct results of the 
pipeline sequencer instruction, interrupt controller in- 
struction, branch enable, and Force Continue bits. 

The Sequencer output enable, A_BUS to D_BUS en- 
able, and interrupt controller chip select are used to 
control which device is allowed to drive the D_BUS in any 
given cycle. These output enables are arranged in a 
priority with only one output allowed to be active in any 
cycle, including the branch field of the control pipeline.

The highest priority output is the branch field. If it is 
enabled all other outputs are disabled. 

If the branch field is disabled, then the Sequencer D 
output is enabled if either a Continue or a Pop D instruction
is being executed.

If neither the branch field nor the sequencer is enabled,
then the interrupt controller may drive the D bus if the 
interrupt controller instruction is one of three read 
operations. 

If none of the above conditions exist to enable the other 
D_BUS devices, then the A_BUS to D_BUS transceiver
path is enabled. 
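
The priority scheme just described amounts to a fixed-priority arbiter. The following Python sketch is illustrative, with assumed argument and return names; the three inputs are the decoded conditions named in the text (branch field enabled, sequencer Continue or Pop D in progress, interrupt controller read instruction).

```python
def d_bus_driver(branch_enabled, sequencer_d_out, interrupt_read):
    """Return the single device allowed to drive the D_BUS this cycle."""
    if branch_enabled:
        return "branch field"            # highest priority
    if sequencer_d_out:
        return "sequencer"
    if interrupt_read:
        return "interrupt controller"
    return "A_BUS to D_BUS transceiver"  # default when nothing else drives
```

Because the function returns exactly one name for every input combination, no combination of microcode bits can enable two drivers at once.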



Note that the interrupt controller chip select is treated as
both an instruction enable and as an output disable. The
chip select is active whenever there is a valid interrupt
instruction that would not cause a conflict with another
driver of the D_BUS. This means that when there is a
valid instruction, the chip select will be inactive only if a
read instruction is selected and either the branch field or
the sequencer is already enabled on the D_BUS. If any
other interrupt instruction is in effect, the interrupt controller
does not drive its outputs.

The above scheme for managing the access rights to the
D_BUS may seem a bit complex but it allows great
flexibility in movement of information over the D_BUS.
Information may be moved between the interrupt
controller and sequencer, interrupt controller and A_BUS, or
sequencer and A_BUS. Information may be loaded into
the interrupt base address register from the pipeline,
sequencer, or A_BUS. Also, the pipeline may provide
data to the sequencer, interrupt controller, interrupt base
address register, or A_BUS.

SECTION 6

System Timing and Critical Path Analysis 



DEFINITIONS 

The upper limit on system speed is set by the slowest
signal propagation path in the system.

The length of a signal propagation path is measured from
the output of one register to the input of another register,
where all registers are loaded by the same clock.

The slowest signal path will be different for different 
control states. An example would be the selection of the 
ALU data path vs. the FPP data path. 

A signal path may be slower in the first cycle that control 
selects the path than it will be in a subsequent cycle that 
maintains the same path selection. This can be due to 
three-state enable or disable times being longer than 
normal propagation delays of the circuits involved. 



CONTROL AND DATA PATHS 

In determining the maximum system speed, every signal 
path must be analyzed. This requires tracing every 
control signal and every data signal and totaling the delay 
induced by each component along the path from source 
register to destination register. Where parallel paths 
exist, the time delay for the slowest path is used. 
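The procedure amounts to summing the segment delays along each path and taking the maximum over parallel paths. The Python sketch below illustrates this with the basic flow-through calculation: the position control path segments use the worst case values listed in Table 6-1, and the data path is entered as the single total quoted for Case 1. The function names and the dictionary layout are assumptions made for this example.

```python
def path_delay(segments):
    """Total delay of one path: the sum of its segment delays in ns."""
    return sum(delay for _, delay in segments)


def critical_path(paths):
    """Return (name, delay) of the slowest of several parallel paths."""
    name = max(paths, key=lambda n: path_delay(paths[n]))
    return name, path_delay(paths[name])


# Position control path, segment values from Table 6-1.
position_control_path = [
    ("pipeline clock to output", 15),
    ("position mux input to output", 25),
    ("ALU position to Y", 48),
    ("register file data set-up", 9),
]

# Data path, entered as the total quoted in the text for Case 1.
data_path = [("data path total, Case 1", 96)]

paths = {"position control": position_control_path, "data": data_path}
name, delay = critical_path(paths)
print(name, delay)  # position control 97
```

The 97 ns result agrees with the figure given for Case 2, where the control path turns out to be slightly slower than the data path.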

Most often, the critical (slowest) paths originate with the
pipeline control register. In the data section the paths will
end with data being loaded into the register file, an FPP
or PM internal register, the system memory, or a D_BUS
destination. In the control section the paths will end with
loading of new control bits into the control pipeline
register.

Figure 6-1. Data Section Timing Paths
Figure 6-2. Control Section Timing Paths



Since the control section and data section operate in 
parallel, the slowest path in either section will determine 
the cycle length required for a specific operation. 

Figures 6-1 and 6-2 provide a block diagram view of 
significant signal pathways for both control and data lines 
in both the control and data sections. 

Referring to these figures as critical timing paths are 
discussed may help in following the timing analysis. 

In this and nearly any complex system, there are hun- 
dreds of pathways that must be traced in order to ensure 
finding all the worst case delays. To go through all of them 
here would require too much time and space. Many of the 
timing paths for this design have already been analyzed, 
and what appear to be the worst case paths will be shown 
here. 



WORST CASE PATHS 

Each case is shown in Table 6-1. The table is separated
into several pages due to its length. It can be viewed as 
a long spreadsheet calculation in which the appropriate 
timing parameters that apply to each case have been 
selected and placed in the correct column. Only the worst 
case delay for each segment of a critical path is shown. 
Parallel but faster paths have been eliminated from each 
case so that the total of the times listed for a case 
represents the minimum time in which a path can be 
traveled. 



Case Definitions 

1. Basic flow-through calculation, data path.

Data is moved from the register file through the
ALU and back to the register file. The timing path
begins at the control pipeline, where the register
file addresses for the A and B read operands
appear after the clock to output delay of the
control pipeline register. These addresses flow
through the Am29827 buffer that forms one side
of the register file address multiplexer. The
address accesses the register file and one access
time later the data operands are presented
to the ALU. By this time the control signals for the
ALU instruction have been stable long enough
that the flow through time of the data in the ALU
will be the slower path. Once data is on the Y bus
the last delay is the set-up time for the register file
before the clock can occur. Again, the control path to
the register file (A port write address) is faster
than the data path so the data path is the limiting
factor.

The total delay for this path is 96 ns. If the PM
path is substituted for the ALU the delay would
be 174 ns. If the FPP were substituted, the delay
is 179 ns. So flow through calculations with
either of the multipliers will require extended
cycle length.

2. Basic flow-through calculation, position control
path.

This case is the same as Case 1 except that a
careful look at the control path for the position
input to the ALU is taken. This path turns out to
be 97 ns worst case. This is an example where
the control path is a little slower than the data
path.

3. Flow-through calculation with address supplied
by the Macro operand counter; counter output
enabled same cycle.

Again this path is similar to Case 1. The difference
is that the read addresses are assumed to
come from the Macro operand counters. It is
further assumed that the counters are selected
during the cycle analyzed. This means that the
output enable time of the counter must be added
to the clock to output time for the pipeline bit that
selects the macro operand counter.

This increases the delay path to 115 ns, indicating
that during the first cycle in which a macro
operand counter is used as the address source,
the cycle length will need to be extended.

4. Flow-through calculation with address supplied
by the Macro operand counter; counter output
enabled prior cycle.

This case is the same as Case 3 except that
the Macro operand counter was output enabled
in the previous cycle. The counter delay is thus
limited to the clock to output delay of the
counter. This reduces the cycle time requirement
to 90 ns. So sequential register file
address cycles using an operand counter can
be completed within the normal cycle time.

5. First cycle of FPP Newton-Raphson division,
seed value load.

In this case the critical path starts at the control
pipeline clock to output delay, and then goes
through the control decode logic that enables the
output of the Seed register. In this case it is
assumed that the seed value is multiplied and
stored in an FPP internal register to complete the
first cycle of a Newton-Raphson division. This
requires a total of 169 ns. Note that if the seed
value had simply been moved into the input
register of the FPP, the total delay would have
only been 73 ns.

6. Memory read with address from the register file,
selected by microcode.

This is a simple memory read with the time
starting at the pipeline clock to output delay,
followed by the address mux, register file access,
A_BUS to MA_BUS buffer, memory, and
register file data set-up time. The total time
comes in at 99 ns, just under the desired 100 ns
basic cycle time.

7. Memory read with address from a memory
address counter.

Here the access time of the register file is essentially
traded for the output enable time of a
memory address counter. The total delay only
improves to 94 ns, but there is a big advantage
in the fact that for a sequential access the CPU
did not need to calculate a memory address.
This will save at least one cycle. Also, it is
possible to overlap a memory read from an
address counter with a calculation cycle in the
CPU.

8. Memory write with data from register file,
selected by operand counter.

In a memory write case, time is saved by needing
only to meet the data set-up time of the memory
rather than the memory access time plus the
register file set-up time, as would be the case in
a read operation. In this case the time gained is
traded for the time required to output enable an
operand counter. Even so, the total time is still
94 ns.

9. Move register file data to interrupt controller or
sequencer, data selected by operand counter.

Here again, the long delay path of using a macro
operand counter as the register file address
source is used. Even with the output enable
delay of the counter in addition to the pipeline
clock to output time, the total delay comes in at
89 ns.

10. Move sequencer or interrupt controller data to
register file.

In the reverse of the above case, the time to get
data from D_BUS is similar to the time in Case 9
to access data from the register file. The big
delay here is the need to move the data from the
A_BUS, through the ALU and back to the register
file. Not having a direct path to the Y_BUS has
cost a good bit of time. The total comes in at
127 ns. Fortunately this type of data move is
not likely to be a commonly executed cycle.

11. Sequencer branch, conditional or unconditional.

In this case much of the delay is in the pipeline
clock to output time for the branch field enable
bit, cascaded with the output enable time of the
branch field in the control pipeline register. This
is followed by the branch address flow through
time of the sequencer and the access time of the
control store. Even with all the delay, this path is
significantly faster than most of the data section
paths. The total time is 84 ns.

12. Sequencer interrupt or trap cycle.

In this case the pipeline output doesn't turn out to
be in the main delay path. The interrupt starts at
the clock to output delay of the trap logic where
the interrupt request is generated. The sequencer
then responds with interrupt acknowledge,
which in turn output enables bit 3 of the
interrupt vector from the trap logic. The interrupt
vector then accesses the control store. The total
for this cycle is 81 ns.

13. Sequencer branch to macro opcode specified
instruction.

Here the initial delay is the clock to output delay
of the macro opcode register, followed by the
access time of the map RAM. Next is the branch
flow through time for the sequencer and the
access of the control store. This cycle comes in
at 85 ns.

FINAL RESULTS 

Several cases were shown here to help give an idea of
how fast the system is for different instructions. These
cases were some of the worst identified during the critical
analysis of this design. All but three of the cases shown
fit within the desired 100 ns basic clock cycle. Two of
the cases would only require a cycle 1 1/3 times normal.
Case 5 officially needs a double length cycle.
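These conclusions can be checked against the case totals quoted above. The Python sketch below assumes, purely for illustration, that the clock is stretched in steps of one third of the 100 ns basic cycle; the text only mentions the 1 1/3 and double length cycles, so the step size and all names here are assumptions rather than a description of the actual clock generator.

```python
import math
from fractions import Fraction

BASE_CYCLE_NS = 100
STEPS_PER_CYCLE = 3  # assumption: cycle length adjustable in 1/3 steps

# Worst case totals in ns, as quoted in the case definitions.
CASE_TOTALS_NS = {
    1: 96, 2: 97, 3: 115, 4: 90, 5: 169, 6: 99, 7: 94,
    8: 94, 9: 89, 10: 127, 11: 84, 12: 81, 13: 85,
}


def cycle_multiple(total_ns):
    """Smallest cycle length, as a multiple of the basic cycle,
    that covers total_ns when stretched in 1/3-cycle steps."""
    steps = math.ceil(Fraction(total_ns * STEPS_PER_CYCLE, BASE_CYCLE_NS))
    return Fraction(max(steps, STEPS_PER_CYCLE), STEPS_PER_CYCLE)


# Cases that do not fit in the basic cycle, with the multiple they need.
extended = {
    case: cycle_multiple(total)
    for case, total in CASE_TOTALS_NS.items()
    if total > BASE_CYCLE_NS
}
print(extended)
```

Under this assumption Cases 3 and 10 need 1 1/3 cycles, while Case 5 at 169 ns just misses 1 2/3 cycles (about 166.7 ns) and so needs the double length cycle.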

As noted in the discussion of Case 1, both the PM and
FPP require much longer cycles for flow through
calculations. If the PM and FPP are used in clocked multiply
mode for sequential pipelined multiplies, as would occur
in array calculations, the cycle time can be significantly
reduced. In clocked multiply mode the PM or the FPP
requires only 100 ns cycle times.

With a dynamically variable clock cycle length, this
system can run most instructions at the basic 100 ns cycle
rate, but will still handle the occasional extended
execution time instructions.

Am29300 Demonstration System
Table 6-1 A 



Signal Path Timing Analysis 



Data Path Element 
Parameter Description 




Worst Case Time Delay in Nanoseconds, Over Commercial Operating Range
3
Case
5 


Case 
6 


Case 

7 


Case
8
9


Case 

10 


Case 
11 


Case 
12 


Case 
13 




Control Store/Register - 
Am9151-50 
Clock to Output 
OE to Output Valid 
Synchronous I to Clock Set-up
Address to Clock Set-up


Tpkhdqv1
Tgldqv 

Tivpkh 
Tavpkh 


15 
20 

25
15


15
15


15


15
15


15 


15 
20 

30 


30 


30 




Control Decode Logic - 
AmPAL18P8B 
Input to Output 


Tpd


Macro Opcode Register -
Am29818-1 
Clock to Output 
Input to Clock Set-up 


Tpd
Ts


Macro Operand Counters -
AmPAL22V10A


Table 6-1 B



Data Path Element
Parameter Description 




Worst Case Time Delay in Nanoseconds, Over Commercial Operating Range 




Symbol 


Value 


Case 

1 


Case 
2 


Case 
3 


Case 
4 


Case 
5 


Case 
6 


Case 
7 


Case 
8 


Case 
9 


Case 

10 


Case 

11 


Case 

12 


Case 
13 


Reg File B Write Add Mux - 
AmPAL22P10AL 
Input to Output 


Tpd 


25 




























ALU Position & Width Mux - 
AmPAL22P10AL 
Input to Output 


Tpd 


25 




25 
























Register File - Am29334
Parameter                       Symbol     Value
Address to Read Data Output     Access     24
OE to Output Valid              Turn-on    20
OE to Output Three-state        Turn-off   16
Data Set-up                     Tds        9


24 
9 


9 


24 
9 


24 
9 




24 
9 


9 


24 


24


ALU - Am29332
Parameter                   Value
Data A or B to Y Parity     42
Instruction to Y Parity     53
Width to Y Parity           40
Position to Y Parity        48


42 


48 


42 


42 












42 








Parallel Multiplier - 
Am29C323 

Unclocked Multiply X or Y 
to P Parity 
Clocked Multiply, 
Cycle Time 
Clocked Multiply, 
Data to Clock Set-up 
Clocked Multiply, 
Clock to Output 


Tmuc 
Tmc 
Tsxy 
Tpdpp


125
40


Table 6-1 C



Data Path Element 
Parameter Description 




Worst Case Time Delay in Nanoseconds, Over Commercial Operating Range 




Symbol 


Value 


Case 

1 


Case 
2 


Case 
3 


Case 
4 


Case 
5 


Case 
6 


Case 

7 


Case 
8 


Case 
9 


Case 
10 


Case 
11 


Case 
12 


Case
13 


Floating Point Processor - 
Am29325 

Unclocked Operations 
Clocked Operation 
Clocked Multiply, 
Data to Clock Set-up 
Clocked Multiply, 
Data to Clock Set-up 


Tsd1
Tsd2


125

100

13 
104 










104 


















FPP Seed Register - 
Am2920 & Am27S25 
OE to Output Valid 


Tzh 


35 










35 


















FPP External Status 
Register -AmPAL22V10A 
Clock to Output 
Input to Clock Set-up 


Tco 
Ts 


15 
20 




























Macro Status Register - 
Am29818-1 
Clock to Output 
Input to Clock Set-up 


Tpd 
Ts 


11


Memory Address or
Data Buffer - Am29827
Input to Output 
OE to Output Valid 


Tph 
Tzh 


10


Memory Address Counters -
AmPAL22V10 
Clock to Output 
Input to Clock Set-up 
OE to Output Valid


Data Path Element
Parameter Description 




Worst Case Time Delay 


In Nanoseconds, Over Commercial Operating Range 












Symbol 


Value 


Case 

1 


Case 
2 


Case 
3 


Case 
4 


Case 
5 


Case 
6 


Case 
7 


Case 
8 


Case 
9 


Case 
10 


Case 

11 


Case 
12 


Case 
13 


Memory - Am99C165-35
Parameter                           Symbol   Value
Chip Enable Access Time             Telqv    35
Address Access Time                 Tavqv    35
Chip Enable to Output Disable       Thz      20
Write Pulse Width                   Twlwh    30
Data to Write End Set-up            Tdvwh    20
Address to Write End Set-up         Tavwh    30
Write to Output Disable             Twlqz    10












35 


35


D_BUS - A_BUS

Transceiver - Am29853 
Input to Parity Output 
OE to Output Valid 


Tpd
Tzh


15


















15 


15 








D_BUS - A_BUS Parity
Buffer - Am29862
Input to Output
OE to Output Valid


Tpd
Tzh


Map RAM - Am9150-25
Address to Data 


Taa 


25 


























25 


Interrupt Controller - Am29114
Parameter                             Value
Clock to Interrupt Request            41
Instruction Enable to Data Output     30
Data In to Clock Set-up               10
MINTA* to Vector OE                   19


Trap Logic - AmPAL22V10A
Clock to Output
Input to Clock Set-up
OE to Output Valid


Ts


Data Path Element
Parameter Description


Worst Case Time Delay in Nanoseconds, Over Commercial Operating Range


Symbol


Value


Case
1


Case
2


Case Case Case Case Case Case Case
Case
Case
Case
13


Sequencer - Am29331 
Branch Input to Y Output 
Instruction to Y Output 
Instruction to D Output 
Force Continue to 
Y Output 

Interrupt Request to 
Interrupt Ack 
OED to D Valid 




19 
25 
31


11


Am29300 Demonstration System Signal Path Timing Analysis

Table 6-1 F 



Case Definitions 



1. Basic flow through calculation, data path.

Pipeline, Tco; Address Mux, Tpd; Register File, Tpd; ALU, Tpd; Register File, Set-up.

2. Basic flow through calculation, position control path. 

Pipeline, Tco; Position Mux, Tpd; ALU, Tpd; Register File, Set-up. 

3. Flow through calculation with address supplied by operand counter; counter output enabled same cycle. 

Pipeline, Tco; Operand Counter, Tea; Register File, Tpd; ALU, Tpd; Register File, Set-up. 

4. Flow through calculation with address supplied by operand counter; counter output enabled prior cycle. 

Pipeline, Tco; Operand Counter, Tco; Register File, Tpd; ALU, Tpd; Register File, Set-up. 

5. First cycle of FPP Newton-Raphson division, seed value load. 

Pipeline, Tco; Control Decode, Tpd; Seed Register, Tzh; FPP Internal Register Set-up, Tsd2. 

6. Memory read with address from the register file, selected by microcode. 

Pipeline, Tco; Address Mux, Tpd; Register File, Taa; Memory Address Buffer, Tpd; Memory, Taa; Register File, Set-up. 

7. Memory read with address from a memory address counter. 

Pipeline, Tco; Control Decode, Tpd; Memory Address Counter, Tzh; Memory, Taa; Register File, Set-up. 

8. Memory Write with data from register file, selected by operand counter. 

Pipeline, Tco; Operand Counter, Tea; Register File, Taa; Memory Address Buffer, Tpd; Memory, Write Set-up. 

9. Move register file data to interrupt controller or sequencer, data selected by operand counter. 

Pipeline, Tco; Operand Counter, Tea; Register File, Taa; A to D Bus Xcever, Tpd; Interrupt Controller, Data Set-up. 

10. Move sequencer or interrupt controller data to register file. 

Pipeline, Tco; Control Decode, Tpd; Sequencer, OED to D; D to A Bus Xcever, Tpd; Parity Buffer, Tpd; ALU, Tpd; Register File, Set-up. 

11. Sequencer branch, conditional or unconditional. 

Pipeline, Tco; Pipeline Branch Field, Tzh; Sequencer, D to Y; Control Store, Address Set-up. 

12. Sequencer interrupt or trap cycle.

Trap Logic, Clock to INTR; Sequencer, INTR to INTA; Trap Logic, Tea; Control Store, Address Set-up. 

13. Sequencer branch to macro opcode specified instruction. 

Macro Opcode Register, Tco; Map RAM, Taa; Sequencer A to Y, Control Store, Address Set-up.

Physical Issues



ELECTRICAL LAYOUT ISSUES FOR 
POWER SUPPLY 

The TTL compatible, bipolar, Am29300 family compo- 
nents all use internal ECL circuitry with TTL compatible 
I/O buffers. 

Each part has a large number of output buffers due to the 
32-bit output bus, plus various status outputs. 

These two facts can make the real world interesting. 

When a large number of the output buffers switch simul- 
taneously, the local Printed Circuit Board (PCB) power 
and ground, and the chip internal power supply lines can 
experience significant noise transients. 

This power supply noise can couple into the internal
logic's ECL VCC pins. Since the internal ECL circuitry is
referenced to the ECL VCC, the power supply noise can
cause short duration shifts in the threshold levels of the
internal logic.

Due to the way ECL circuitry operates, it has much
smaller noise margins than equivalent TTL circuits. The
threshold shifts result in lower than normal noise margins
in already sensitive high speed circuits. These reduced
noise margins can result in noise-induced logic errors.

It is, therefore, very important to provide very good power 
distribution and decoupling in a system using the 
Am29300 family. It is strongly suggested that a multi- 
layer PCB be used to provide power and ground planes. 
It is also important to minimize coupling between the 
TTL and ECL VCC pins of any Am29300 bipolar device. 
This can be done in part through good power supply de- 
coupling. 

An additional way to decouple the ECL and TTL VCC pins
is to introduce inductive isolation. The simplest way to do
that is to place a cut in the VCC plane that separates the
ECL supply pins from the TTL pins. This produces a
longer electrical path between the pins, which adds
inductance between the pins. This inductive isolation will
significantly reduce noise coupling.

Some suggested PCB layouts for use with the Am29300
family are shown in Figures 7-1a and 7-1b. The images
are negatives where black indicates an absence of metal
in the VCC plane.

Although significant noise can also occur on the TTL and
ECL ground lines, the ECL circuits are much less sensitive
to this noise. Attempting to isolate the TTL and ECL
ground pins from each other can create more problems
than it solves. Any isolation will reduce the noise in the
ECL circuitry and thereby make the chip internal ECL
ground "different" from the TTL ground. This can reduce
the noise margin in the ECL to TTL conversion logic,
introducing potential for noise induced errors. It is
recommended that no isolation between grounds be used.

DECOUPLING CAPACITORS 

An added help in providing local VCC to ground decou- 
pling is available in the form of under-chip capacitors. 

Special capacitors for PGA device packages have been 
developed by Rogers Corporation, Q-PAC Div., 2400 
South Roosevelt St., Tempe, AZ. 85282. 

SOCKETS 

Whenever high pin count, expensive VLSI components 
are used in a system, many hardware designers prefer to 
have the devices in sockets. This allows easy removal for 
repairs or upgrades and provides an additional test point 
in the system. 

Sockets for the Am29300 family are available from Augat
Corporation, Interconnection Component Div., 33 Perry
Ave., Attleboro, MA 02703.

SECTION 8

Conclusion 



There are many ways to skin a cat and, surprisingly,
many more ways to build a computer. This application
note has tried to guide the reader through just one
simple implementation. The author hopes some of the
reasons behind the design choices in a microprogrammed
computer design were made clear during the course
of the description.

Aside from some general notions about how a micropro- 
grammed system works, the reader should walk away 
having noted the following thoughts: 

This design is a full 32-bit processor capable of executing 
a full 32-bit add, barrel shift, logical, integer multiply, or 
even floating point multiply every 100 ns to 133 ns. That 
is a 7 to 10 Million Instructions Per Second (MIPS) rate, 
which is (loosely) comparable to 7 times the performance 
of a VAX 11/780. 

For all that computing horsepower, the real core of this
machine is made from only 6 chips: the Am29300 family
of computer building blocks. That's an incredible degree
of integration as compared with previous approaches to
high performance microprogrammed computer design.

Most of the logic surrounding the Am29300 family com- 
ponents is not required. The additional logic is used to 
add system flexibility and to show off different aspects of 
microprogrammed design. Very little glue is needed to 
hold this family together. 

There is very little in the way of standard SSI logic 
used. Virtually all the MSI and SSI level logic functions 
were incorporated into Programmable Array Logic. 
This shows the versatility and integration that PALs can 
provide. 

Due to use of Serial Shadow Registers throughout the 
system, there is a reasonable hope that enough of the 
system state can be read and controlled so that debug- 
ging in the factory or field will be simple. This access to 
the internal structure of the machine is gained with very 
little "excess" logic. 



This application note, augmented by 60 pages of PAL
and Am29PL141 definition files, is available as a
separate booklet: Publication No. 09856A.

Product Application






The fast way to build 
a RISC processor 

A family of 32-bit VLSI ICs yields

reduced instruction-set computers 

with a variety of architectures 

Dhaval Ajmera and Cheng-Gang Kong 

Production Planning and Development Engineers 

Microprogrammable Processes 

Advanced Micro Devices, Inc., Sunnyvale, CA 






Central processing units with reduced instruction sets
fall into two categories. Single-chip versions are
champion performers, but their fixed instruction sets
mean that software compatibility can be a problem.
Others are built from an army of discrete components and
small-, medium-, and large-scale ICs (SSI, MSI, and LSI)
and so suffer from high chip counts, long interchip
delays, and great power dissipation.

A good compromise between the two is a team of a few
very large-scale IC (VLSI) parts, namely the bipolar
Am29300 and CMOS Am29C300 families of VLSI building
blocks (see box, "VLSI RISC"). By using these families,
it is possible to adapt an operating system and
instruction set to a reduced-instruction-set computer
(RISC) architecture while maintaining software
compatibility.

As a family, the 29300 can support the extremely fast
cycle time of 80 ns, and both it and the 29C300 group
have a 32-bit fixed word length. That word length affords
high precision for arithmetic operations as well as a wide
bandwidth for memory and a large (4-gigabyte) addressing
capability for virtual-memory operations.

[Figure 1(a) field labels: OPCODE (7), SCC (1), DESTINATION (5), SOURCE 1 (5), IMM (1), SOURCE 2 (13). Key: SCC = set condition code; IMM = immediate.]
[Figure 1(b) labels: IF, EXE, delayed branch. Key: IF = instruction fetch; EXE = execution.]

Fig. 1. The RISC word for both the Berkeley and the AMD
reduced instruction set is fixed at 32 bits (a). In the
AMD RISC hardware, the pipeline structure consists of a
simple, two-level instruction-fetch-and-execute
configuration (b).

Reprinted with permission from Electronic Products, Vol. 29 No. 12, November 17,
1986. Copyright 1986, Hearst Business Communications Inc.

Each family member fulfills a dis- 
tinct function, allowing the RISC 
designer considerable freedom to 
configure them in a variety of archi- 
tectures. Because, for example, the 
Am29334 register file building block 
is functionally separate from the Am- 
29332 arithmetic logic unit (ALU), 
several Am29334 can be used to 
vary the size of the register file as 
required. In addition, data from the 
registers can be shared by other par- 
allel devices besides the ALU. 

The high level of integration of the 29300 and 29C300
family members favors higher performance because
interchip delays are shorter. Also, systems need fewer
and smaller boards to mount a lower parts count, and less
power is dissipated; both factors tend to reduce costs.

The AMD RISC architecture closely resembles the RISC I
developed at the University of California at Berkeley,
which has 33 instructions. Basic to both architectures is
a fixed instruction format.

VLSI RISC

A reduced-instruction-set processor could be designed
onto a custom VLSI chip, for a price. Or it could be
constructed from numerous, less integrated ICs, in many
man-hours. The golden mean, however, is to turn to already
available general-purpose VLSI building blocks, for these
simplify the design job yet can be obtained off the shelf.
The Am29300 family from Advanced Micro Devices in
Sunnyvale, CA, includes the 32-bit arithmetic logic unit,
the 32-bit register file, and the bounds checker needed to
build the RISC described in the accompanying article.

The Am29332 ALU is housed in a 168-pin grid array and
sells for $495 each in 100-unit quantities. The Am29334
four-port, dual-access register file is packaged in the
120-pin grid array and sells for $180 each in 100-unit
quantities. The Am29337 bounds checker comes in 28-pin
ceramic DIP and is priced at $22 in 100-unit quantities.
Other building blocks in the Am29300 family are available.

Every instruction word is 32 bits wide (see Fig. 1a). Its
op code occupies a field of 7 bits. Three fields totaling
23 more bits specify two source operands and a single
destination. These fields are always in the same position
in the instruction word, an arrangement that makes it
relatively simple to decode the op code in parallel with
the operand access.
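Because the fields sit at fixed positions, packing and unpacking an instruction word is a matter of shifts and masks. The Python sketch below uses the field widths shown in the instruction format figure (7 + 1 + 5 + 5 + 1 + 13 = 32 bits). The bit ordering (op code in the most significant bits), the short field names, and the function names are assumptions made for this illustration.

```python
# Field widths from the instruction format figure, listed from the
# most significant field down. The ordering is an assumption.
FIELDS = [
    ("opcode", 7),
    ("scc", 1),    # set condition code
    ("dest", 5),
    ("src1", 5),
    ("imm", 1),    # immediate flag
    ("src2", 13),
]


def pack(**values):
    """Build a 32-bit instruction word from named field values."""
    word = 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack(word):
    """Split a 32-bit instruction word into its fields."""
    fields = {}
    shift = 32
    for name, width in FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields
```

Since every field is extracted with a constant shift and mask, the op code and the register numbers can be pulled out of the word at the same time, which is the property that allows decoding to overlap operand access.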

A two-level pipeline 

The pipeline of the AMD RISC is a simple, two-level
structure. One level fetches an instruction while the
other is executing the instruction fetched immediately
beforehand (see Fig. 1b).

This concurrency, however, cre- 
ates difficulties with branch instruc- 
tions. A conditional branch instruc- 
tion cannot make its condition avail- 
able until it has been executed. 
Therefore, the instruction fetched 
during its execution might not be the 
correct one. 

Fig. 2. The AMD RISC system includes a set of four
Am29334 registers and an Am29332 ALU, which derives its
7-bit op-code controls from a PLA. The Am29337 bounds
checker identifies all memory references to the file
registers.

To circumvent this pipeline lock-step dependency, a
method called delayed branch is used. A code reorganizer
(a program) rearranges the sequence of instructions so
that the one immediately following the branch instruction
is always executed regardless of the branching condition
(see Fig. 1b again). In 9 out of 10 cases, a useful
operation can be inserted. The rest of the time a NOP
fills in. In other words, whatever the result of the
branch instruction, it takes effect only after an
intervening instruction has been dealt with.
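The effect of a delayed branch can be seen in a small simulation. The Python sketch below is a toy model, not a description of the AMD hardware: the program representation, the mnemonics, and the function name are assumptions for this example, and only an unconditional jump is modeled.

```python
def run(program, max_steps=100):
    """Execute a list of (mnemonic, target) pairs with delayed-branch
    semantics and return the indices of the instructions executed."""
    trace = []
    pc = 0
    pending_target = None  # branch target waiting for the delay slot
    while pc < len(program) and len(trace) < max_steps:
        mnemonic, target = program[pc]
        trace.append(pc)
        next_pc = pc + 1
        if pending_target is not None:
            # This instruction was the delay slot; the branch lands now.
            next_pc = pending_target
            pending_target = None
        if mnemonic == "jump":
            # The branch takes effect after the next instruction.
            pending_target = target
        pc = next_pc
    return trace


program = [
    ("add", None),   # 0
    ("jump", 4),     # 1: branch to instruction 4
    ("sub", None),   # 2: delay slot, always executed
    ("mul", None),   # 3: skipped
    ("or", None),    # 4: branch target
]
executed = [program[i][0] for i in run(program)]
print(executed)  # ['add', 'jump', 'sub', 'or']
```

The instruction in the slot after the jump runs even though the jump is taken, so a code reorganizer that can place useful work there loses no cycle to the branch.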

Exceptions are another pipeline hazard. When one occurs,
the pipeline contents are duplicated by three registers
in the program counter unit. This unit is routed to the
ALU through the A multiplexer (see Fig. 2), a feature
that allows the return address to be saved when a call
instruction is executed. During exception handling, this
path also makes it possible to save the contents of the
three program counter registers and to use them to
restart the processor.






derived from the instruction's 7-bit 
op code through a programmable 
logic array (PLA). The Am29332 is
a 32-bit-wide ALU that performs all 
arithmetic and Boolean operations. 
A high data-transfer rate is provided 
by a powerful, orthogonal instruction 
set. To enhance system performance, 
the device also features a 64-bit-in, 
32-bit-out funnel shifter, as well as 
a 32-bit barrel shifter and a priority 
encoder. 



Fig. 3. The register window of the
AMD RISC is functionally divided
into four sections (a). Every procedure
of the program shares the 10
global registers (b).



in the execution of high-level lan- 
guages. 

Four Am29334s, with the aid of 
some SSI and MSI chips, provide 
seven register windows and 10 global 
registers. Altogether, they easily fit 
onto a standard hex card. 

One register window is allocated 
to each procedure. Each window 
consists of 32 registers; thus at any 
time just 32 registers are visible to 
the currently executing procedure. 

The instruction set enables constants
to be formed through the instruction
word directly. Before a
constant can be fed into the ALU,
however, some data has to be rerouted
to generate it. This rerouting
is done by the constant generator,
which in essence uses 32 two-input
multiplexers to produce the proper
constant. The result is then fed via
the B multiplexer to an ALU input.
The control section of the AMD
RISC is relatively simple (see Fig.
2 again). All the control signals are



The Am29334 register file is a
four-port, dual-access file that can be
used to implement a distinctive feature
of the Berkeley RISC: its so-called
overlapped register windows.
This overlapping improves the speed
at which the procedures (or subroutines)
in an application program can
pass parameters among themselves
and the main program in a call-return
sequence. Berkeley researchers
developed the technique after finding
that parameter passing is one
of the most time-consuming events



The 32 are functionally partitioned
into four sections: 10 global and 10
local registers, as well as 6 apiece for
incoming and outgoing parameters
(see Fig. 3a). (In the Berkeley
RISC, there are 138 registers
grouped into 8 register windows.)

The 10 global registers (R22 to
R31) are shared by every procedure
of the program (see Fig. 3b). They
are used primarily for globally referenced
items such as a system's
commonly applicable constants.

The 10 local registers (R6 to
R15), dedicated to the procedure itself,
store local variables.

Fig. 4. The AMD and Berkeley RISC register numbering are 1's
complements of each other (a). Also, either procedure can only be
translated into the other if they are mapped one on one (b). Both the
1's complementing and the mapping are simple operations.

Six registers (R0 to R5) accept
incoming parameters from the call- 
ing procedure for use by the called 
procedure. They are also used to re- 
turn results from the called to the 
calling procedure. 

When the called procedure in turn
summons another, it puts its outgoing
parameters in six registers (R16
to R21) that then overlap the six
incoming-parameter registers of this
last procedure.

With such a register organization, 
parameters can be rapidly trans- 
ferred between procedures, as the 
three register windows in Figure 3b 
illustrate. When procedure A calls 
procedure B, all the parameters pass 
through the outgoing-parameter reg- 
isters of A to become the incoming- 
parameter registers of B, which can 
operate on these parameters without 



32-Bit Computer Performance Benchmarks

Benchmark               AMD RISC   Berkeley RISC I   Typical 32-bit
                          (ms)          (ms)              (ms)
E-string search            0.115         0.46              0.59
F-bit test                 0.015         0.06              0.29
H-linked list              0.025         0.10              0.12
K-bit matrix               0.108         0.43              1.29
I-quicksort               12.6          50.4             151.2
Ackermann (3,6)          800         3,200             5,120
Recursive Q sort         200           800             1,840
Puzzle (subscript)     1,175         4,700             9,400
Puzzle (pointer)         800         3,200             4,160
SED (batch editor)     1,275         5,100             5,610
Towers of Hanoi (18)   1,700         6,800            12,240
Average times faster       8             4                 1



accessing the stack memory. The 
same principle applies when B calls 
C. When C finishes, the results return
through the outgoing parameters
of B (or incoming of C). In turn,
B also returns its results through the 
outgoing parameters of A. 

The register numbering used in 
the AMD RISC for the windowing 
scheme is the 1's complement of its
Berkeley RISC counterpart, a con- 
vention easily implemented with 
simple address-generation logic (see 
Fig. 4a). (A one-to-one mapping still 
remains between these two proces- 
sors after this numbering change.) 

The address generation logic maps
any register number greater than 21
into the global registers. The mapping
is done by appending the lower 4 bits
of the register specifier to three 1s.
This operation maps it to a high address
in the register file.

To generate the address of a local
register, the pointer to the current
window (logically a 7-bit register)
is added to the register specifier. The
current-window pointer is the base
pointer for the currently visible registers.
It is advanced to the next window
base pointer when a call instruction
is executed; it is restored to the
previous window base pointer when
a return is executed. Since each register
window is offset from the previous
window by 16 registers (due to
the overlap illustrated in Fig. 3b),
the lower 4 bits of the current-window
pointer are always zero. Therefore,
an incrementer at the fifth bit
position of this pointer can be used
to add in the register specifier. Thus
connecting the fifth bit of the register
specifier to the carry-in of the
current-window pointer's incrementer
generates the proper address for
registers 0 to 21.
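
The address-generation rule just described can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not AMD's actual logic: the function name `physical_address` and its argument names are invented here, and the register file is assumed to be addressed by 7 bits.

```python
def physical_address(cwp: int, spec: int) -> int:
    """Map a 5-bit register specifier to a 7-bit register-file address.

    cwp  -- current-window pointer (hypothetical name), a 7-bit value
            whose lower 4 bits are always zero
    spec -- register specifier taken from the instruction, 0..31
    """
    if not 0 <= spec <= 31:
        raise ValueError("register specifier must be 0..31")
    if not 0 <= cwp <= 0x7F or cwp & 0xF:
        raise ValueError("window pointer must be a 7-bit multiple of 16")

    if spec > 21:
        # Global register: three 1s followed by the lower 4 specifier bits.
        return 0b1110000 | (spec & 0xF)

    # Windowed register: the lower 4 specifier bits pass straight through,
    # and the fifth specifier bit is the carry-in of an incrementer that
    # acts on the upper 3 bits of the window pointer.
    upper = ((cwp >> 4) + (spec >> 4)) & 0x7
    return (upper << 4) | (spec & 0xF)
```

In this model, specifier 16 in a window based at 0 and specifier 0 in the next window (based at 16) resolve to the same address, which is the overlap that lets one procedure's outgoing parameters become the next procedure's incoming parameters.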

The comparator generates the
proper select signal to gate the appropriate
address (global or local)
to the register file. With the projected
80 ns of combined propagation
delay of the Am29332 and
Am29334, a 100-ns system cycle time
can be easily obtained.

The register file, part of the system's
run-time stack, is mapped into



RISC's minimalist philosophy



A new style of computer architecture has
stirred a lot of attention recently. It's
called RISC, for reduced instruction-set
computer. Examples of it are the University
of California at Berkeley's RISC I
and RISC II, IBM's 801 project, and Stanford
University's MIPS (for microprocessor
without interlocked pipe stages).

The time-honored route in system design
has been to capitalize on progress in
IC technology by increasing the complexity
of computer architecture, with the
goal of narrowing the "semantic gap"
between the high-level languages of programming
and the bit languages of machines.
Complex instruction-set computers,
or CISCs, are one result. But the
side effects are unpleasant: longer design
times, more numerous design errors,
and inconsistent implementations.

This outcome triggered an about-turn 
in favor of simplicity. RISC designers try 
to select only the most frequently used, 
primitive instructions and to execute
them very fast. Some of the main archi- 
tectural design principles of the RISC 
are: 
• Execute one instruction per cycle. 

Program traces show that the most 
heavily used instructions are quite primi- 
tive. They also execute in one cycle. 
Hardwiring instead of microprogramming 
them enhances overall performance by 
eliminating the overhead incurred in mi- 
crocode interpretation. The lengthy, 
highly complex, and infrequently sum- 
moned instructions provided by the CISC 
but omitted on the RISC can be imple- 



mented by software subroutines. 

• Use a fixed instruction format. 

A fixed instruction format greatly sim- 
plifies instruction decoding and thus the 
hardware. Each field of the instruction 
word is dedicated to a particular function. 
For example, a fixed field is dedicated to 
the op code, and two or three fields are 
dedicated to operand specifiers. An added 
benefit is that an instruction with this 
format may allow some signals to be de- 
rived directly from it, permitting several 
operations to overlap. 

• Employ a load/store architecture. 
Memory references alone are done by 

load- or store-register operations. All the 
other operations are register-to-register. 
The simplicity of this addressing mode 
makes it easy to implement. The absence 
of complex addressing modes also makes 
it easier to restart instructions when an 
exception occurs. 

• Support high-level languages. 

The simple instruction set supplies the 
compiler with only the most primitive op- 
erations. From these the compiler can 
compose instruction sequences that are 
tailored to the exact requirements of the 
programming language. In some archi- 
tectures, the hardware savings realized 
by the simple implementation is invested 
in speeding up some of the high-level 
language's more time-consuming operations.
The University of California at
Berkeley RISC processor, for instance,
includes a large register file for speeding
up the sequence of calling and returning
from a procedure.



the main memory (see Fig. 4b). The
Am29337 bound-checking facility 
detects any memory reference to this 
section and reports it to the CPU. 
The CPU can then redirect the ref- 
erence to the proper data store in the 
register file. 

Performance evaluation 

Usually it is hard to compare one 
architecture to another with any ac- 
curacy. The AMD RISC, though, is 
functionally compatible with Berke- 
ley's RISC I, so that published pa- 
rameters can serve as a basis for pre- 
dicting their relative performance. 
The comparison is also predicated 
upon the following four assump- 
tions: 

• A 100-ns cycle time. The Am29332 
and Am29334 will contribute 80 ns 
to the total cycle time, and the regis- 
ter address generator and source 
multiplexer add another 20 ns (provided
Schottky TTL components
form the glue logic of the circuit).

• A 100-ns instruction cache. It has
been established that an 8-Kbyte directly
mapped instruction cache can
provide a hit ratio of 99.8% on the VAX-11
(programs written in C and running
under Unix). High-speed
RAMs (around 45 ns) are available
from which a 100-ns instruction-cache
memory with a good hit ratio
can be easily constructed.

• The execution of the same instruc- 
tions as RISC I. Register renaming 
of the code is easy. 

• No adverse impact on performance
due to the AMD RISC's having one
fewer register window (Berkeley's
RISC I has eight register windows
versus seven for AMD).

For a simulated RISC I running 
11 benchmark programs written in 
C, the system cycle time was 400 ns. 
For the AMD system running the 
same programs, it was 100 ns, or four 
times shorter. Further, as the table 
indicates, the AMD implementation 
averages about eight times faster 
than a typical 32-bit superminicomputer.

FAULT-TOLERANT CHIPS
INCREASE SYSTEM
RELIABILITY

Using parity checking and a master/slave duplication 
technique, a bipolar chip set provides an interlocking 
fault-detection scheme that enhances fault tolerance. 



by Tim Olson 



Fault-tolerant computers have been used in satellites, 
aircraft, and industrial control and communications 
applications. The use of fault-tolerant techniques is 
currently being extended into other arenas, includ- 
ing on-line transaction processing and increasingly 
complex very large-scale integration circuitry. In 
addition, the rising cost of system maintenance and 
repair is causing a demand for fault-tolerant system 
building blocks that enhance system availability and 
reliability. 

The Advanced Micro Devices 32-bit, micropro- 
grammable chip set addresses these needs. The 
Am29300 family, which consists of the Am29332 
arithmetic logic unit (ALU), Am29331 sequencer, 
Am29334 register file, Am29325 floating-point processor
and Am29323 multiplier, uses an interlocking
fault-detection scheme to provide fault tolerance.
This detection scheme consists of a parity-check sys- 
tem and a master/slave duplication technique. 

Add a bit 

Parity-check codes are a form of error detection 
in which a single parity bit is appended to a group 
of data bits. The addition of this single bit changes 
the number of zeros and ones within the bit group. 
If, with the addition of the parity bit, the group has 
an even number of ones, the group has even parity; 



Tim Olson is a product engineer for Advanced Micro 
Devices (Sunnyvale, CA). He holds an MS in electri- 
cal engineering from the University of Arizona. 
Order # 08087A 
Reprinted with permission from Computer Design. 




if it has an odd number of ones, the group has odd 
parity. Parity-check codes can detect all single-bit 
errors, as well as errors that involve an odd num- 
ber of bits. For groups with an odd number of bits, 
even parity can detect the all-ones condition and odd 
parity, the all-zeros condition. 

To detect data-transmission errors, the Am29300
family checks parity according to bytes. In this
scheme, a parity bit is appended to each byte in the
32-bit word, resulting in four 9-bit groups. Each
group contains a single parity bit. There are three
reasons for using byte parity: fault coverage, decreased
cycle time and byte-write capability. Fault
coverage is increased by providing a single parity bit
per byte. This technique catches many faults that
would go undetected if a single parity bit per word
were used.

Decreased cycle time refers to the fact that four
byte-wide parity circuits operating in parallel can generate and
perform a parity check faster than a single 32-bit
parity-generation system. Byte-write capability provides
other advantages. In byte parity, individual
bytes can be written back into the register file without
reading the rest of the 32-bit word to compute
parity.

Advanced Micro Devices' Am29300 32-bit, bipolar,
microprogrammable chip set consists of five devices
that support fault-tolerant designs by providing parity
checking/generation (a) and master/slave duplication
(b) as fault-detection techniques. Parity checking provides
fault coverage for data storage and interchip
connections. More elaborate coverage is provided by
master/slave checking. With this technique, two identical
copies of a device are used in parallel, with one
designated as master and the other as slave. For increased
reliability, even the checking scheme can be
checked.



The Am29300 family uses even parity, which ex- 
tends fault coverage to include a floating input bus. 
This parity scheme includes an all-ones failure mode, 
which occurs if a failure in the source device pre- 
vents it from driving the bus or if a failure in the 
control path prevents the source device from being 
accessed. Parity bits are stored in the register file, 
checked when input to the ALU and multiplier, and 
then generated as an output. If a parity error is de- 
tected on either of the two input buses, the Parity- 
Error output is asserted. This output is active high 
to provide fault detection for the error signals. 
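
A short Python sketch may make the byte-parity scheme concrete. It assumes even parity over each 9-bit group, as the text describes; the function names and the list-of-four-bits representation are illustrative choices, not part of the Am29300 documentation.

```python
def even_parity_bit(byte):
    """Return the bit that gives the 9-bit group an even number of ones."""
    return bin(byte & 0xFF).count("1") & 1

def byte_parity(word):
    """Return four parity bits for a 32-bit word, least significant byte first."""
    return [even_parity_bit((word >> (8 * i)) & 0xFF) for i in range(4)]

def parity_errors(word, parity):
    """Flag each 9-bit group whose total number of ones is odd."""
    return [expected != received
            for expected, received in zip(byte_parity(word), parity)]

# A floating bus read as all ones: every 9-bit group holds nine ones,
# which is odd, so all four groups are flagged.
floating_bus_flags = parity_errors(0xFFFFFFFF, [1, 1, 1, 1])
```

Because each byte carries its own parity bit, rewriting one byte only requires recomputing `even_parity_bit` for that byte, which is the byte-write advantage mentioned earlier.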

This parity scheme provides fault detection on 
both the data storage and the interchip connections. 
Since the Am29332 ALU and the Am29323 multi- 
plier perform operations on data that cannot carry 
parity bits, however, a more elaborate checking 
scheme is used. This system is called master/slave 
checking. 

More than one copy 

Master/slave checking uses duplication as a fault-detection
technique. Two identical copies of a device
are used in parallel; one is designated as master,
the other as slave. The master device computes
a result from the inputs and moves its result to the
chip outputs. The slave device also computes a result
from the inputs, but all of its outputs (except
for MS-Error) are changed to inputs that carry the
results of the master.

The slave compares its result with the result of the
master and signals any discrepancy on the MS-Error 
output. This output, like Parity-Error, is active high 
to provide fault detection for the error signal. Mas- 
ter/slave checking can detect multiple failures in both 
the master and the slave devices, as long as at least 
one failure is nonoverlapping. This checking system 
also detects output bus contention, which is indicated 
by the MS-Error output on the master device. This 
output is activated when the master result and the 
output bus fail to match. 
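
The comparison each device performs can be modeled very simply. The sketch below is a toy model in Python, not a description of the chips' internals: `alu_add` stands in for whatever operation the ALU happens to execute, and `ms_error` is a hypothetical name for the compare that drives the MS-Error output.

```python
MASK32 = 0xFFFFFFFF

def alu_add(a, b):
    """Stand-in for one ALU operation: a 32-bit wrapping add."""
    return (a + b) & MASK32

def ms_error(own_result, bus_value):
    """Return True when a device's own result differs from the output bus."""
    return own_result != bus_value

a, b = 0x7FFFFFFF, 1
master_result = alu_add(a, b)   # the master drives this onto the outputs
slave_result = alu_add(a, b)    # the slave computes the same thing privately

# Fault-free case: the slave sees its own result on the bus.
healthy = ms_error(slave_result, master_result)

# Injected fault: one bit of the bus value is flipped.
corrupted_bus = master_result ^ 0x4
detected = ms_error(slave_result, corrupted_bus)
```

The same compare, applied by the master to its own result and the bus, models the bus-contention check described above.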

For systems that must operate nonstop, master/ 
slave techniques may also be applied at the board 
level. Two sets of master/slave pairs are used; one 
is active and the other is standby. If the slave of the 
active pair signals an MS-Error, the active pair is 
turned off and the standby pair is activated. The 
standby pair may also perform transactions while 
the active pair is running, resulting in twice the 
throughput of normal operation. 

The ALU, multiplier and sequencer all have a 
master/slave operation mode. This mode, combined
with parity checking of the data paths, provides com- 
plete interlocking fault detection on a cycle-by- 
cycle basis. 

The fault recovery process can identify two types
of faults: permanent or transient. Permanent, or 
hard faults, are caused by physical changes in the 
hardware (failures), while transient, or soft faults, 
are due to unstable hardware or temporary environ- 
mental conditions. Detection of a permanent fault 
may cause a standby unit to take over for the failed 
device. 

When transient faults are detected, on the other
hand, the microinstruction that faulted will be 
restarted after the transient condition disappears. In 
either case, the faulted microinstruction must be 
aborted, so that no state change occurs to disrupt 
the restarting of the microinstruction. 

To restart the microinstruction, the sequencer performs
traps at any microinstruction boundary. When
a trap condition is signaled by the simultaneous assertion
of the interrupt-request and force-continue
signals with the carry input (Cin) signal disabled
(which causes the address incrementer to pass the current address
instead of the next address), the sequencer puts the
Y output bus in a high-impedance state. This allows
an external trap vector to be placed on it. The sequencer
then pushes the trapped microinstruction


address onto the internal stack and starts fetching 
microinstructions, using the trap vector as the start- 
ing address. The aborted microinstruction is stored 
on top of the stack and is restarted by executing a 
return instruction. When the Hold input is assert- 
ed, updates of the ALU's internal state are inhibit- 
ed. This ensures that the aborted microinstruction 
has no effect. 

Fault-tolerant CPU design 

In order to show how the Am29300 family mem- 
bers interact to perform fault detection, recovery and 
isolation, consider a simple CPU design. In this de- 
sign, the data path consists of two sets of register 
files and two ALUs in a master/slave configuration. 
Because new data may already have been written to 
the register file before a fault is signaled, two register
file sets are required. One register file set holds
the working address and data registers, while the 
other set holds backup copies of these registers that 
are used in error recovery. 

The ALUs perform address and data calculations, 
which are used to address memory via the data-out, 
data-in and address registers. These registers are built 
from Am29818 diagnostics registers that offer off-line
testing and fault diagnosis. The control path
starts with the instruction register, which consists
of four serial shadow registers. The instruction is
applied to a mapping PROM to derive the starting
microcode address for the sequencer, which is built
from two Am29331 sequencers in a master/slave configuration.
The microinstruction is fetched from the
writable control store and loaded into pipeline registers,
which distribute control throughout the CPU.

... sequencer changes the Y output bus to a high-impedance state, allowing an external trap to be placed on it.

Fault detection, recovery and isolation 

During instruction execution, errors are detected 
on a cycle-by-cycle basis by the sequencer and ALU 
master/slave pairs. They are signaled with the Parity- 
Error and MS-Error outputs. These error signals are 
prioritized by a vectored priority-interrupt controller, 
which causes the sequencers to trap the microinstruc- 
tion that is currently executing. The trap vector is 
then put on the Y output bus. The controller also 
asserts the Hold pin on the ALUs, which prevents 
the trapped microinstruction from updating the in- 
ternal state of the ALU and disables writes to the 



backup register file. Writes to the backup register 
file are disabled, keeping the state of the ALU prior 
to the trapped microinstruction intact. 

Microinstruction processing then begins with the 
trap routine associated with the highest priority fault 
indication. This routine can determine whether the 
fault is transient or permanent. If the fault is transient,
the trapped microinstruction must be restarted.
The trap handler first restores the state of the reg- 
ister file by copying each of the registers in the back- 
up register file into the working register file, restoring 
the registers to the values they held prior to the fault. 
Any other state that was saved during trap process- 
ing is also restored during this process. The se- 
quencers then perform a return instruction, popping 
the trapped microinstruction address from the stack. 

To increase system availability, permanent faults 
must be isolated quickly. This usually involves run- 
ning a series of test patterns through the devices to 
determine which ones have failed. These patterns can 
be loaded and tested quickly using the serial shadow 
registers. All of the serial shadow registers in the 
CPU design are connected by a serial link that forms 



a diagnostics loop. Arbitrary patterns can be loaded 
serially through the loop, then clocked through in 
a single system cycle. The resulting state can be read 
out from the loop for use in isolating the failed 
device. 

Checking the checkers 

Failures in checking devices are even more serious. 
A failed checker can give a false indication of error 
or a no-error condition. While false indications of 
failure are tolerable, a no-error condition often re- 
sults in undetected faults. 

There are three basic fault detectors in the CPU 
design: the Am29332 parity checker, the Am29332 
master/slave checker and the Am29331 master/slave 
checker. These fault-detection circuits must be veri- 
fied during system initialization, and their opera- 
tional status should be confirmed periodically during 
subsequent operation. 

Fault injection, which is the process of deliber- 
ately causing a fault in the part of the system that 
is checked by the fault-detection hardware, can be 
used to perform this verification. The parity-check 



circuitry can be tested by loading a word with bad 
parity into the data-in register via the serial link. It 
is then loaded into the register file and used in an 
ALU operation. This procedure should detect a par- 
ity error. 

Another method of verifying the parity checker 
is to issue a microcode instruction that performs an 
ALU operation while the register-file outputs are in 
a high-impedance state. The parity checker should 
detect the all-ones condition and flag the error. 

Master/slave checking can be verified on the ALU 
by using the Hold input. The status registers in the 
master and slave are first set to a known equivalent 
state. The next microinstruction alters that state, but 
asserts the Hold input on one of the devices, inhibit- 
ing the status update. A master/slave error, caused 
by the differing status outputs, should occur. Mas- 
ter/slave checking can also be verified on the 
Am29331 sequencers by executing a jump instruc- 
tion while asserting the force-continue input on one 
of the parts. The part without the asserted force- 
continue input executes the jump, causing a nonse- 
quential address for the next microinstruction. The 
force-continue asserted on the other sequencer overrides
the jump instruction, causing the next microinstruction
address to be sequential. This results in
differing addresses, which in turn cause a master/slave
error.

The AMD family extends many of the concepts
of fault-tolerant computing, including parity checking
and master/slave duplication, into the 32-bit arena.
This fault-detection scheme can identify both
permanent and transient faults, ensuring broad-based
fault protection throughout the system.

Designer's Guide to:
Floating-point processing — Part 1 



Floating-point math 
handles iterative and 
recursive algorithms 



Floating-point arithmetic gives you better dynamic
range and precision than integer arithmetic, but it
needs careful implementation. Part 1 of this 3-part
series discusses possible sources of error you may
encounter when using floating-point hardware, and it
reviews the current standards. Part 2 will describe
the advantages of fast array processors, and Part 3
will discuss algorithmic options for floating-point
processors and considerations when implementing a
complete system.

Charlie Ashton, Advanced Micro Devices Inc 

Many signal-processing algorithms, such as fast Fou- 
rier transforms, generate outputs whose magnitudes 
far exceed those of the inputs. Nevertheless, those 
outputs must retain the precision of the input operands 
if the accuracy of the computation is not to be so 
severely degraded as to render the results meaning- 
less. For these and similar applications that use itera- 
tive or recursive algorithms, true floating-point opera- 
tion often furnishes the only acceptable number 
representation. 
Until recently, you needed a very good reason to give 

Reprinted with permission from EDN, January 9, 1986. Copyright 1986, Reed 
Publishing USA. 
your system floating-point hardware. It was large,
expensive, power-hungry, and relatively slow (al- 
though faster than the software-based implementations 
needed to perform comparable operations). However, 
the introduction of fast VLSI array processors has 
changed the picture. These devices (such as Weitek's 
1032/1033 and AMD's Am29325) can stand alone and are 
implemented on one or two chips. You can now economi- 
cally use floating-point hardware in applications whose 
size and budget constraints would previously have 
forced the use of fixed-point hardware or floating-point 
software. 

The new chips won't dissipate all your potential 
headaches, of course. Just one of the many choices you'll 
have to make is which standard to support. The four 
most commonly used standards (IEEE, DEC, IBM, 
and MIL-STD-1750A) have subtly different binary rep- 
resentations of floating-point numbers. Each standard 
has advantages and disadvantages for specific types of 
computational problems. This series of articles covers 
some of the theoretical considerations you'll have to 
take into account, as well as some specifics on the 
available chips. 

The manner in which a system represents floating- 
point numbers clearly affects both the dynamic range 
and the precision of the system. The most obvious way 

VLSI processors now make floating-point
hardware cost effective in applications with
severe budget or size constraints.



to represent numbers is to use a signed exponent and a 
signed fraction (Table 1). A large exponent field obviously
supports a large dynamic range: A 2-digit exponent,
for example, implies a dynamic range of $10^{\pm 99}$,
whereas a 3-digit exponent increases the dynamic
range to $10^{\pm 999}$. Similarly, the more digits you can
include in the fraction, the greater will be the precision 
of the number, especially if the number is normalized so 
that the left-most digit of the fraction is nonzero. 
Leading zeros in the fraction of an unnormalized num- 
ber clearly reduce the precision of that number. As a 
general principle, then, the precision of a floating-point 



TABLE 1: SIGNED vs BIASED EXPONENTS

DECIMAL NUMBER      SIGNED EXPONENT      FRACTION
-123.45         =   10^+3            ×   -0.12345
+0.0000678      =   10^-4            ×    0.678

DECIMAL NUMBER      BIASED EXPONENT      FRACTION
-123.45         =   5 + 3 = 8        ×   -0.12345
+0.0000678      =   5 - 4 = 1        ×    0.678



number depends on the length of its fraction, and the 
dynamic range depends on the size of the exponent and 
the radix. 

In practice, floating-point hardware generally uses a 
biased exponent for two reasons. First, use of a biased 
exponent avoids problems that follow from the need to 
handle negative numbers in the exponent circuitry. 
Second (and perhaps more important), a suitable choice 
of bias can ensure that you'll be able to compute the 
reciprocals of all the representable numbers without 
exponential overflow or underflow. You'll find that 
overflow and underflow cause plenty of problems in 
computing the fraction portion of the output (see box, 
"Dealing with underflow and overflow"). You certainly 
don't want to introduce them into exponential computa- 
tions as well. 
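
As a concrete case, the IEEE single-precision format stores its exponent with a bias of 127. The Python sketch below, which relies only on the standard `struct` module, pulls the stored exponent field out of a number; the helper names `exponent_field` and `true_exponent` are illustrative choices. Note that IEEE normalizes the significand to the form 1.xxx rather than the 0.xxx fraction used in this article's decimal examples.

```python
import struct

IEEE_SINGLE_BIAS = 127  # exponent bias of the IEEE single-precision format

def exponent_field(x):
    """Return the 8-bit biased exponent stored in x's single-precision encoding."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return (bits >> 23) & 0xFF

def true_exponent(x):
    """Remove the bias; meaningful for normalized, nonzero values only."""
    return exponent_field(x) - IEEE_SINGLE_BIAS
```

The stored field is always a nonnegative integer, so the exponent circuitry never has to handle a sign: 1.0 is stored with field 127, 0.5 with 126, and 8.0 with 130.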

Biased exponents and normalized fractions are the 
features that give true floating-point representation a 
clear advantage over block floating-point and integer 
formats. To double the dynamic range of an integer 
word, you have to double the number of bits in it. To 
obtain the same result in true floating-point operation, 
you need to add only one bit to the exponential field. In 



fact, a 32-bit floating-point number in IEEE format has 
a dynamic range equivalent to that of a 276-bit 2's- 
complement integer. 

Despite the high precision and large dynamic range 
of normalized floating-point numbers, floating-point 
systems do not altogether escape the effect of quantiza- 
tion (rounding) errors. You can think of a floating-point 
system as producing an infinitely precise result (ie, a 
fraction of unlimited length, abbreviated "IPR"), which 
is then rounded to fit into the destination format. 
Typically, this strategy means that some of the low-order
fraction bits are lost. Consequently, whenever
the destination format lacks enough bits to accommo- 
date the IPR, rounding introduces quantization errors, 
which in turn result in system noise. Consider, for 
example, the multiplication of two numbers in a 4-digit 
decimal system: 

(0.8102X I03)x(0.8001x 10-')=0.6410401x 10-*. 

The IPR is rounded to 0.6410 x 10'* to fit the destina- 
tion format, thus introducing a quantization error. In 
practice, quantization errors during a long computation 
will be random, and the overall effect will be analogous 
to an increase in system white noise. If the quantization 
errors are not random, they may appear as system
nonlinearities and, as a consequence, cause serious 
problems in such applications as spectral analysis. 
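The rounding step described above can be sketched with Python's `decimal` module, which lets you limit the destination format to four significant digits. The operands here are hypothetical values chosen for illustration; they are not the operands of the article's example.

```python
from decimal import Decimal, localcontext

def multiply_rounded(a, b, digits=4):
    """Return (exact product, product rounded to `digits` significant digits)."""
    exact = a * b                  # default context (28 digits) is exact for these inputs
    with localcontext() as ctx:
        ctx.prec = digits          # destination format: 4 significant digits
        rounded = a * b            # the infinitely precise result is rounded to fit
    return exact, rounded

a = Decimal("812.3")               # hypothetical operands
b = Decimal("0.4567")
exact, rounded = multiply_rounded(a, b)
error = rounded - exact            # quantization error introduced by rounding
print(exact, rounded, error)
```

The difference between the two results is the quantization error that the text treats as noise.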

Are quantization errors data dependent? 

Mathematical analysis of an integer system shows 
that quantization errors due to rounding have a mean 
value of one-quarter the value of the least significant 
bit. The relative error at each rounding thus depends 
on the magnitude of the operand being rounded. There- 
fore, as the magnitude of the operand decreases, the 
relative quantization error increases. The same is true 
of a block floating-point system, in which denormalized 
operands may contain leading zeros. In integer and 
block-floating-point systems, therefore, the errors are 
data-dependent, and for this reason error analysis is 
both difficult and time-consuming. 

In true floating-point systems, however, operands 
are generally normalized, so the relative quantization 
errors are the same, regardless of the magnitude of the 
operands. Quantization error analysis in floating-point 
systems is thus data independent and therefore doesn't 
require complicated worst-case simulations. 

Floating-point systems can suffer from a computa-
tional drawback known as the "operand ordering problem."
Consider the addition of three floating-point
numbers: A ( = 1), B (=2»), and C (= -2"). You may find 
that (A+B)+C=0, although A+(B+C)=1. This result 
clearly violates the associative law of addition. The 
discrepancy occurs because the floating-point standard 
doesn't have enough bits to accommodate the interme- 
diate result of the first calculation (A+B). The hard- 
ware has to round the IPR, ^+1, to the nearest 
representable number, which is 2". Errors of this kind 
are inevitable whenever the IPR has to be rounded to 
fit the destination format, although they would usually 
be considered so small as to be unimportant. 
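You can reproduce the operand-ordering effect on any machine with IEEE arithmetic. The sketch below uses Python's 64-bit floats, so the large operand is taken as 2^53 (an assumption chosen to match the 53-bit significand of a double, not the value used in the article).

```python
A = 1.0
B = 2.0 ** 53        # assumed magnitude: just past the integer precision of a double
C = -(2.0 ** 53)

left = (A + B) + C   # A + B is rounded back to 2**53, so the 1 is lost
right = A + (B + C)  # B + C is exactly zero, so the 1 survives
print(left, right)
```

The two groupings give different answers even though each individual addition is correctly rounded.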



You can minimize rounding errors (although, as the
previous example shows, you can't entirely remove
them) by a judicious choice of rounding mode. Some
floating-point standards allow you to select from among
several rounding modes the one that best suits your
operation. All of the commonly used floating-point
standards support one or more of four modes:

• Round-to-nearest mode replaces the IPR with the
closest representation that fits in the destination
format. In the case of an IPR that falls exactly
halfway between two representations, the IEEE
standard rounds the IPR to the representation



Dealing with underflow and overflow 



For the rare cases in which the
result of a calculation is too 
large or too small to be repre- 
sented, you must have previous- 
ly specified the way in which 
your system will deal with that 
result. In short, your system 
must handle the related prob-
lems of underflow and overflow.

Underflow arises when the 
rounded result of an operation is 
a number between zero and the 
smallest representable norma- 
lized number. You can handle 
such a number in one of two 
ways: You can set the number to 
zero (sudden underflow), or you 
can represent the rounded result 
by a denormalized number 
(gradual underflow). 

Overflow occurs when the 
rounded result of an operation is 
greater than the largest repre- 
sentable number. You can handle 
this problem by setting the re- 
sult to infinity, which implicitly 
terminates a chain of calcula- 
tions, or by saturating the result 
to the largest representable
number (correctly signed). 

It's important to know which 
of the various methods your sys-
tem supports, because in some



applications sudden underflow or 
saturated overflow can destroy 
the accuracy of an entire series 
of calculations. The IEEE stan- 
dard, for example, treats under- 
flows by invoking the gradual 
underflow method, while the 
IBM and DEC standards deal 
with only sudden underflow. 

Sudden underflow is generally 
the fastest method of treating 
underflows and is acceptable in 
the majority of systems because
high accuracy is seldom required 
for very small numbers. Sudden 
underflow can produce quantiza- 
tion errors almost as large as 
the smallest normalized number, 
but usually you can treat these 
errors as insignificant. 

The gradual-underflow method
creates much smaller errors be-
cause it rounds results to a denor-
malized number. On the other
hand, gradual underflow is more 
difficult and more expensive to 
implement than sudden under- 
flow, a drawback you'll have to 
weigh against the advantage of 
accurate results over a wider 
range of numbers. Gradual 
underflow is generally best for 
iterative applications in which 



you drive a residual value to 
zero and for which you require 
maximum possible accuracy. 
When such a residual value 
underflows gradually to zero, 
you know that it's negligible 
compared with every normalized 
number. 

For handling overflow, data-
processing applications generally 
set the result to infinity, because 
in a high-accuracy mathematical 
model a saturated result could 
destroy the accuracy of an entire 
series of calculations. In real- 
time digital signal processing, 
however, it's generally prefera-
ble to saturate the result and
continue the chain of calcula- 
tions. In the analysis of radar 
returns, for example, you would 
certainly not want a single 
anomalous return to bring the 
entire processing sequence to a 
halt by introducing an operand 
(an infinity) that would be use- 
less in further processing. In 
this and similar applications, it's 
often better to have an approxi- 
mately correct data point than 
no data point at all. 
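The four policies in this box can be sketched in a few lines of Python. The sketch assumes that Python floats are IEEE double-precision numbers with gradual underflow (true on common platforms); the helper names `sudden_underflow` and `saturate` are hypothetical.

```python
import math
import sys

SMALLEST_NORMAL = sys.float_info.min   # smallest normalized double
LARGEST = sys.float_info.max           # largest finite double

def sudden_underflow(x):
    """Set any nonzero result below the smallest normalized number to zero."""
    if x != 0.0 and abs(x) < SMALLEST_NORMAL:
        return math.copysign(0.0, x)
    return x

def saturate(x):
    """Replace an infinite result with the largest finite number, correctly signed."""
    if math.isinf(x):
        return math.copysign(LARGEST, x)
    return x

tiny = SMALLEST_NORMAL / 4   # gradual underflow: a nonzero denormalized number
huge = LARGEST * 2           # overflow: the result is set to infinity
print(tiny, sudden_underflow(tiny), huge, saturate(huge))
```

Gradual underflow and overflow to infinity are what the hardware does by default here; the two helper functions emulate the alternative policies.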
TABLE 2—NUMBER REPRESENTATION
IN FOUR FLOATING-POINT STANDARDS

(a) IEEE FORMAT

BIT:      31      30 ... 23          22 ... 0
WEIGHT:   S       2^7 ... 2^0        2^-1 ... 2^-23
FIELD:    SIGN    BIASED EXPONENT    FRACTION
          (S)     (E)                (F)

E = 0 AND F = 0        V = (-1)^S · 0 (-0, +0)
E = 0 AND F ≠ 0        V = (-1)^S · 0.F · 2^-126 (DENORMALIZED)
0 < E < 255            V = (-1)^S · 1.F · 2^(E-127) (NORMALIZED)
E = 255 AND F = 0      V = (-1)^S · ∞ (-∞, +∞)
E = 255 AND F ≠ 0      V = NaN (NOT-A-NUMBER)

(b) DEC FORMAT

BIT:      31      30 ... 23          22 ... 0
WEIGHT:   S       2^7 ... 2^0        2^-2 ... 2^-24
FIELD:    SIGN    BIASED EXPONENT    FRACTION
          (S)     (E)                (F)

S = 1 AND E = 0        V = DEC RESERVED OPERAND
S = 0 AND E = 0        V = 0
E > 0                  V = (-1)^S · 0.1F · 2^(E-128) (NORMALIZED)

(c) IBM FORMAT

BIT:      31      30 ... 24          23 ... 0
WEIGHT:   S       2^6 ... 2^0        2^-1 ... 2^-24
FIELD:    SIGN    BIASED EXPONENT    FRACTION
          (S)     (E)                (F)

F = 0                  V = (-1)^S · 0 (-0, +0)
F ≠ 0                  V = (-1)^S · 0.F · 16^(E-64)

(d) MIL-STD-1750A FORMAT

BIT:      31      30 ... 8           7       6 ... 0
WEIGHT:   -2^0    2^-1 ... 2^-23     -2^7    2^6 ... 2^0
FIELD:    FRACTION (F)               EXPONENT (E)

V = F · 2^E



EDN January 9, 1986



6-105 



CHAPTER 6 
Articles/Application Notes 



having an LSB of zero, whereas the DEC stan- 
dard rounds the IPR to the representation that 
has the greater magnitude. 

• Round-to-minus-infinity mode rounds the IPR to 
the closest representable value that is less than or 
equal to the IPR. 

• Round-to-plus-infinity mode rounds the IPR to 
the closest representable value that is greater 
than or equal to the IPR. 

• Round-to-zero mode is analogous to truncation; it 
rounds the IPR to the closest representable value 
with a magnitude less than or equal to that of the 
IPR. 
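The four modes, plus the DEC variant of round-to-nearest, can be compared side by side with Python's `decimal` module. This is an illustrative sketch in decimal arithmetic, rounding to whole numbers; mapping the DEC behavior to `ROUND_HALF_UP` (ties go to the larger magnitude) is an assumption based on the description above.

```python
from decimal import (Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP,
                     ROUND_FLOOR, ROUND_CEILING, ROUND_DOWN)

MODES = {
    "nearest (IEEE, ties to even LSB)": ROUND_HALF_EVEN,
    "nearest (DEC, ties to larger magnitude)": ROUND_HALF_UP,
    "minus infinity": ROUND_FLOOR,
    "plus infinity": ROUND_CEILING,
    "zero": ROUND_DOWN,
}

def round_ipr(ipr, mode):
    """Round an 'infinitely precise result' to the nearest whole number."""
    return ipr.quantize(Decimal("1"), rounding=mode)

for ipr in (Decimal("2.5"), Decimal("-2.5"), Decimal("2.4")):
    for name, mode in MODES.items():
        print(ipr, "->", round_ipr(ipr, mode), "in round-to-" + name)
```

The halfway cases (2.5 and -2.5) are the only inputs on which the two round-to-nearest variants disagree.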

As noted earlier, the various floating-point standards 
specify different binary representations of floating- 
point numbers, and you'll have to match their respec- 
tive advantages and disadvantages to your own compu- 
tational problems. The four of the most common binary 
floating-point standards, the IEEE, DEC, IBM, and 
MIL-STD-1750A standards, all represent single-preci- 
sion, floating-point numbers by means of 32-bit words 
having the formats shown in Table 2. All four standards 
support double-precision data, and some of these stan- 
dards also support other data types, such as single- 
extended and double-extended data. 

The IEEE working group presented the specifica- 
tions contained in proposed standard P754, draft 10.1, 
as a robust standard for portable floating-point soft- 
ware. This proposed standard has received wide ac- 
ceptance, and it's likely to form the basis of a large 
number of future hardware implementations. P754 has 



Biased exponents and normalized fractions
give true floating-point systems a clear ad-
vantage over integer and block-floating-
point systems.

several features that aren't found in other standards. In 
particular, +0, -0, and infinities are all valid operands. 
Operations performed on infinities signal no exceptions 
unless the operation itself is invalid. The standard 
allows the use of a special operand known as NaN 
(Not-a-Number). An implementation should interpret 
NaNs as signals rather than numbers, and it should use 
NaNs to indicate invalid operations or to pass status 
information through a series of calculations. Also, the 
standard accepts denormalized numbers as a represen- 
tation of a result that is less than the smallest norma- 
lized number. 

The DEC standard is implemented in all DEC VAX 
minicomputers; the VAX Architecture Manual contains 
the full specifications of the standard. Conceptually 
simpler than the IEEE standard, the DEC standard 
has no provisions for infinities or denormalized num- 
bers, and it has only a single representation for zero. 
The DEC standard does, however, incorporate DEC 
reserved operands, which are analogous to IEEE 
NaNs. 

An important feature common to both the IEEE and 
the DEC standards is the existence of a hidden bit. 
Both standards specify that all operands will be norma- 
lized (except for denormalized numbers in the IEEE 
format). This stricture implies that the leading fraction 
bit must always be a one. This bit would not only be 
redundant if included in the 32-bit representation, but 
it would actually reduce the precision of the number, so 
its presence is assumed. In the case of IEEE denor- 
malized numbers, the biased exponent is zero, thereby 

continued, page 6-106 



TABLE 3-COMPARISON OF FLOATING-POINT STANDARDS

                            IEEE              DEC               IBM        1750A
LARGEST POSITIVE NUMBER     2^128 - 2^104     2^127 - 2^103     …          …
SMALLEST POSITIVE NUMBER    2^-149            2^-128            2^-280     2^-129
LARGEST NEGATIVE NUMBER     -2^128 + 2^104    -2^127 + 2^103    …          -2^127
SMALLEST NEGATIVE NUMBER    -2^-149           -2^-128           -2^-280    -2^-129
DYNAMIC RANGE               …                 …                 2^533      2^256
PRECISION                   …                 2^-23             2^-20      2^-23
VLSI floating-point µP for recursive algorithms

One example of floating-point 
hardware that handles recursive 
algorithms is the Am29325 from 
Advanced Micro Devices. The 
processor integrates a 32-bit 
adder/subtracter, a multiplier, 
and a data path on a single chip. 
This level of integration reduces 
the processing overhead in- 
curred by chip sets comprising 
separate ALU and multiplier 
chips. The internal feedback 
paths facilitate the implementa- 
tion of such recursive algorithms 
as sum-of-products and Newton- 
Raphson division. 

The processor supports both 
the IEEE and DEC floating- 
point formats. The instruction 
set includes instructions that 
convert data from IEEE format 
to DEC format and vice versa, 
as well as instructions that con- 
vert data to and from 32-bit in- 
teger format. 

Three functional blocks 

The processor has three main 
functional blocks (Fig A): a
floating-point ALU, a status-flag 
generator, and a 32-bit internal 
data path. The ALU is fully 
combinatorial, and it performs 
all instructions in a single cycle. 
The eight instructions handle 
floating-point R+S, R-S, RxS,
and 2-S operations as well as 
the format conversions. 

The 2-S instruction forms the 
core of the Newton-Raphson di- 
vision algorithm, which performs 
division by a sequence of itera- 
tions. In this and other iterative 
algorithms, intermediate results 
are retained in the R or S regis- 
ter, thereby eliminating the 
need for any off-chip registers 
and minimizing the number of 
required data transfers. 
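The role of the 2-S operation can be seen in the textbook Newton-Raphson reciprocal iteration, x <- x(2 - bx), which converges to 1/b. The sketch below is a software illustration of that iteration only; the function names, the starting estimate, and the iteration count are assumptions, not the Am29325's microcode.

```python
def reciprocal(b, x0, iterations=5):
    """Approximate 1/b by repeating x <- x * (2 - b*x)."""
    x = x0
    for _ in range(iterations):
        # One multiply (b*x), one "2 - S" operation, one multiply.
        x = x * (2.0 - b * x)
    return x

def divide(a, b, x0):
    """Compute a/b as a times the iterated reciprocal of b."""
    return a * reciprocal(b, x0)

print(divide(1.0, 3.0, x0=0.3))   # x0 is a rough first estimate of 1/3
```

Each pass roughly squares the relative error of the estimate, so a modest starting guess converges in a handful of iterations.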

Three programmable I/O 
modes allow the Am29325 to in- 
terface with a variety of sys- 
tems. The 32-bit, 2-input-bus 
mode uses three separate 32-bit 



Fig A — This VLSI floating-point processor is fast because it contains all the major
components for 32-bit operations on a single chip. It has one input for an external clock
and 17 inputs for instruction-select and control functions.



buses (R, S, and P) for high- 
speed, nonmultiplexed operation; 
in this case, the R and S regis- 
ters are configured as indepen- 
dent 32-bit ports. In the 32-bit, 
1-input-bus mode, both the R 
and S registers are connected to 
a common 32-bit input bus; the 
host multiplexes operands onto 
this bus. In the 16-bit, 2-input- 
bus mode, 32-bit operands are 
multiplexed onto the correspond- 
ing 16-bit buses (low-order bits 
first). 

Six flags and four modes 

The status-flag generator pro- 
vides six fully decoded flags.
Four of these flags report excep- 
tional conditions, as defined in 



the IEEE standard. The remain- 
ing two flags identify zero-val-
ued or nonnumerical results.

The Am29325 implements the
four IEEE-mandated rounding 
modes: round-to-nearest, round- 
to-plus-infinity, round-to-minus- 
infinity, and round-to-zero. The 
same four modes are supported 
for the DEC standard, except 
that when the infinitely precise 
result is halfway between two 
representable numbers, the 
IEEE round-to-nearest mode 
rounds to the closest representa- 
tion with an LSB of zero, 
whereas the DEC round-to-near- 
est mode rounds to the value 
with the larger magnitude. 
instructing the system to assume that the value of the
hidden bit is also zero. 
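The hidden-bit rule is easy to see when you decode a 32-bit IEEE word by hand. The sketch below follows the IEEE cases of Table 2(a); the function names are hypothetical, and `struct` is used only to obtain test words from Python floats.

```python
import struct

def decode_ieee_single(word):
    """Decode a 32-bit IEEE single-precision word into a Python float."""
    s = (word >> 31) & 0x1           # sign bit
    e = (word >> 23) & 0xFF          # 8-bit biased exponent
    f = word & 0x7FFFFF              # 23-bit fraction
    sign = -1.0 if s else 1.0
    frac = f / 2.0 ** 23             # fraction as a value in [0, 1)
    if e == 0:                       # zero or denormalized: hidden bit is 0
        return sign * frac * 2.0 ** -126
    if e == 255:                     # infinity or NaN
        return sign * float("inf") if f == 0 else float("nan")
    return sign * (1.0 + frac) * 2.0 ** (e - 127)   # normalized: hidden bit is 1

def float_to_word(x):
    """Return the 32-bit IEEE single-precision pattern of x."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

print(decode_ieee_single(float_to_word(-6.5)))
```

Only the 23 stored fraction bits appear in the word; the leading 1 (or 0 for denormalized numbers) is supplied by the decoder.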

The IBM floating-point standard differs from its 
IEEE and DEC counterparts in several respects. It has 
no provision for infinities or reserved operands, al- 
though it does accept denormalized numbers. More 
important, however, are the absence of a hidden bit and 
the use of radix 16 rather than radix 2. Because the 
exponent of an IBM number is expressed as a power of 
16, the standard has a large dynamic range. For the 
same reason, however, numbers are spaced farther 
apart than in the other formats. This increased gran- 
ularity results in less precision than is provided by the 
IEEE and DEC formats. Also, the use of radix 16 
allows as many as three leading zeros in the binary 
fraction of a normalized number, even though the 
leading hexadecimal digit is nonzero if the number is 
expressed in hexadecimal format. The leading binary 
zeros can cause the precision to vary from one operand 
to another. This variation is known as wobbling.
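A short calculation shows the size of the wobble. The sketch assumes the 24-bit fraction of the 32-bit IBM format and counts how many fraction bits remain significant for a given leading hexadecimal digit; the function name is hypothetical.

```python
def significant_bits(leading_hex_digit, fraction_bits=24):
    """Significant bits in a normalized radix-16 fraction."""
    if not 1 <= leading_hex_digit <= 15:
        raise ValueError("a normalized fraction has a nonzero leading hex digit")
    # A leading digit of 1 (binary 0001) wastes three bits; 8 to F waste none.
    leading_zeros = 4 - leading_hex_digit.bit_length()
    return fraction_bits - leading_zeros

for digit in (0x1, 0x2, 0x4, 0x8, 0xF):
    print(hex(digit), significant_bits(digit))
```

The effective precision therefore varies between 21 and 24 bits from one operand to the next.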

The MIL-STD-1750A standard, developed for use in 
military systems, allows no reserved operands, infini- 
ties, or denormalized numbers. Furthermore, the use 
of a 2's-complement fraction, rather than a sign-magni-
tude representation as in the other three formats,



requires a somewhat different hardware architecture. 
The applications to which each of the four standards 
is best suited differ quite widely. Nevertheless, you can 
make a simple comparison (Table 3) between the 
standards, based on factors such as the largest and 
smallest representable numbers, the dynamic range, 
and the precision. Such a comparison can be useful in 
selecting the most suitable format for a given applica- 
tion. In most cases, however, the format to be used is 
determined by outside constraints, such as compatibili- 
ty with existing hardware or software. 
Designer's Guide to:
Floating-point processing — Part 2 



Floating-point array processor improves computational power



Powerful math-processing chips configured with high- 
speed memories and controllers form the core of a 
floating-point math or array processor for small 
computers. This second part of EDN's 3-part float-
ing-point math series discusses the tradeoffs you
must make to add flexibility and speed to array- 
processor ( 



Robert M Perlman, Advanced Micro Devices

For such jobs as digital-signal processing, image pro- 
cessing, graphics, and scientific calculations, an array 
processor can take over repetitive arithmetic chores 
while your host computer performs control tasks and 
retrieves information. By employing a floating-point 
array processor, you also increase the math-processing 
power of your computer system. 

The basic array-processor design (Fig 1) contains an 
arithmetic unit, a controller, data memory, program 
memory, and a host interface (see box, "Array pro- 
cessor vs general-purpose computer"). If you use newer 
control, memory, and math chips, you can fit the circuit 
on a single pc board. This array-processor design uses 
an Am29325 floating-point processor chip, which oper-

Reprinted with permission from EDN, January 23, 1986. Copyright 1986, Reed 
Publishing USA. 



ates with either IEEE- or DEC-standard single-preci- 
sion data. The chip performs single-cycle floating-point 
additions, subtractions, multiplications, and format 
conversions at an 8-MHz clock frequency.

Because the Am29325 chip contains a floating-point
arithmetic unit (AU), three 32-bit registers, two data 
buses, and two data-selection multiplexers, you need 
only a small amount of external hardware to design a 
complete math- or array-processor circuit. In the 
array-processor design, the Am29325 receives oper- 
ands from two high-speed memories. An 8kx32-bit 
RAM provides input data for your algorithms, and it 
stores intermediate and final results. An 8k x 32-bit 
PROM provides constant values for the algorithms. 

Although you can design a circuit that specifically 
controls the math chip and its associated memory chips, 
you'll find an equivalent circuit in the 2910A micropro-
grammable controller chip. The 2910A chip is a general- 
purpose controller; it's not dedicated to controlling the 
Am29325. The controller chip contains a program 
counter, a loop counter, a LIFO stack, and other
circuits that access program instructions and control 
the array processor in the basic design. The controller 
provides an 11-bit address for the design's 2kx64-bit 
microprogram memory, which contains the instructions 
for your algorithms. Each algorithm instruction con- 
A basic array processor speeds math opera-
tions by performing repetitive tasks quickly. 



tains 64 bits that the circuit divides into seven groups of 
outputs: 

• 11 jump-address bits

• one address and write-enable multiplexer bit 

• one write-enable control bit 

• 13 RAM-address bits 

• 13 PROM-address bits 

• 24 miscellaneous control bits 

• one interrupt-control line. 
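The seven groups add up to exactly 64 bits, which you can check by packing them into one word. In this sketch the field names and their order within the word are assumptions made for illustration; the article gives only the widths.

```python
# (name, width in bits); names and ordering are hypothetical
FIELDS = [
    ("jump_address", 11),
    ("addr_we_mux", 1),
    ("write_enable", 1),
    ("ram_address", 13),
    ("prom_address", 13),
    ("misc_control", 24),
    ("interrupt", 1),
]

def pack(values):
    """Pack a dict of field values into one 64-bit microinstruction word."""
    word = 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word

def unpack(word):
    """Split a 64-bit microinstruction word back into its fields."""
    values = {}
    for name, width in reversed(FIELDS):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values

sample = {"jump_address": 0x155, "ram_address": 0x1ABC, "prom_address": 7, "interrupt": 1}
print(hex(pack(sample)))
```

The 13-bit RAM and PROM address fields match the 8k-word depth of the two data memories, and the 11-bit jump address matches the 2k-word microprogram memory.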

The microprogram memory routes its outputs 
through an internal register and then to the rest of the 
array-processing hardware. Although it may not be
obvious, the register at the microprogram memory's 
output helps maintain high-speed data processing. By 
using a clocked register to hold the memory's output
bits, the controller latches a 64-bit instruction while it 



addresses the microprogram memory for the next 
instruction. The memory's output register therefore 
permits the overlap of the instruction-fetch and -exe- 
cute operations, which saves processing time. 

Because it holds information for a pending operation, 
the microprogram memory's output register is often 
referred to as a pipeline register. Array processors can 
contain a series of pipeline registers, the number of 
which depends on the architecture of the array pro- 
cessor and the maximum processing speed you need. 

Host interface links processors 

You must carefully choose your host-computer inter- 
face circuits according to the type of system bus in your 
computer. You can accommodate most general-purpose 
computers by providing bus buffers for the address, 
Fig 1—The Am29325 floating-point processor used in this design adheres to IEEE and DEC floating-point standards.
TABLE 1—
BENCHMARK EXECUTION TIMES

OPERATION                      EXECUTION TIME
5-TAP FIR FILTER               1.125 µSEC
RADIX-2 FFT BUTTERFLY          1.25 µSEC
4x1 MATRIX ADDITION            1.0 µSEC
4x4 MATRIX MULTIPLICATION      14.0 µSEC



data, and control lines. You'll also need a small amount 
of control logic to manage the flow of information to and 
from the array processor and the host computer. For 
example, you can construct a Multibus interface by 
using octal bus buffers and PAL chips. If your host 
computer's data bus contains fewer than 32 data bits, 
you'll need to convert the data to and from the 32-bit 
format that the array processor requires. You can 
include double-buffer latch circuits for the data inputs 
to the array processor, and you can provide latches and 
multiplexers on the processor's data-output lines. 

The host computer's data bus provides the main link 
between the host and the array processor. Your com- 
puter starts a math operation by loading the RAM with 
raw data and then signaling the array processor to 
start a math-processing algorithm. After the processor 
runs an algorithm program, your host computer reads 
the RAM's contents to obtain the results. 

To simplify the data-transfer operations to and from 
the host computer, the array processor goes into an 
idle, or standby, state when it isn't running an algo- 
rithm program. Instead of controlling the processor's 
data and control lines, the microprogram controller 
continuously runs a 1-microinstruction program loop.
In addition, the idle microinstruction switches the 
RAM's address and write-enable multiplexers so that 
the RAM appears to be part of the host computer's 
main memory. The host computer loads the desired 
input data into the data RAM, and it then loads the 
microprogram controller with the starting address of 
the algorithm you want to run. The microprogram 
controller then jumps to the preprogrammed sequence 
of microinstructions for the algorithm. The algorithm's 
first microinstruction reconfigures the data RAM so 
that only the array processor can address it. When the 
algorithm completes its tasks, it sends an interrupt 
signal to the host processor, switches the data RAM 
back to the host, and executes the 1-instruction standby 
loop. 

Once you're sure the array processor is operating 




properly, you can test the operating speed of your 
circuit by using benchmark programs tailored to specif- 
ic tasks (Table 1). The benchmark times were calcu- 
lated for the array processor with an 8-MHz clock 
frequency. The basic processor performs one data- 
RAM operation (read or write) per clock cycle. 
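Because the clock runs at 8 MHz, each cycle lasts 125 nsec, and you can convert the benchmark times into cycle counts. The sketch below assumes that the Table 1 times are in microseconds and reads the matrix-addition entry as 1.0.

```python
CLOCK_MHZ = 8   # clock frequency given in the text

# Execution times in microseconds, as read from Table 1 (assumed units)
BENCHMARKS_USEC = {
    "5-tap FIR filter": 1.125,
    "radix-2 FFT butterfly": 1.25,
    "4x1 matrix addition": 1.0,
    "4x4 matrix multiplication": 14.0,
}

def cycles(time_usec, clock_mhz=CLOCK_MHZ):
    """Number of clock cycles that fit in the given execution time."""
    return round(time_usec * clock_mhz)

for name, time_usec in BENCHMARKS_USEC.items():
    print(f"{name}: {cycles(time_usec)} cycles")
```

Under these assumptions every benchmark time corresponds to a whole number of clock cycles.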

Modifications improve performance

Although the basic array-processor circuit works 
well, you can improve its performance. The ability to 
take data addresses directly from the program memory 
in the simple array processor means that the program 
memory must contain a section of microcode for each 
iteration of an algorithm. For example, a program that
performs 20 matrix multiplications contains a separate 
section of microprogram code for each multiplication 



Fig 2 — You can implement the program memory in two ways:
Either you can include steps for each iteration of your algorithm (a),
or you can add an address-generator circuit (b) that lets you use only
one section of code for all iterations. The address generator locates
specific values and coefficients in memory automatically.

step. Each code section contains specific addresses for 
data and coefficients (Fig 2a). The in-line coding ap- 
proach therefore wastes program-memory space. 

One improvement found in virtually every array 
processor is a data-address-generator circuit that gen- 
erates the necessary data and coefficient addresses 
within the array processor. The address-generator 
hardware reduces the amount of microprogram memo- 
ry you'll need for an algorithm. By using such hard- 
ware, the processor performs multiple iterations of an 
operation by looping through the same section of micro- 
code as many times as necessary (Fig 2b). 
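In software terms, an address generator is a loop that produces the operand addresses so that the code for one product term can be reused. The sketch below models this for a 4x4 matrix multiplication; the row-major layout, the base addresses, and the placement of one matrix in RAM and the other in PROM are assumptions made for illustration.

```python
def matrix_multiply_addresses(n=4, a_base=0, b_base=0):
    """Yield (ram_address, prom_address) pairs for C = A x B.

    A is assumed to sit in data RAM and B in coefficient PROM,
    both stored row by row starting at the given base addresses.
    """
    for i in range(n):              # row of A
        for j in range(n):          # column of B
            for k in range(n):      # position along the dot product
                yield a_base + i * n + k, b_base + k * n + j

pairs = list(matrix_multiply_addresses())
print(len(pairs), pairs[:4])
```

One multiply-accumulate microinstruction, executed once per generated pair, replaces the separate in-line code sections of Fig 2a.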

Depending on your specific tasks, you can choose a 
data-address generator that fits a specific algorithm, 
such as the fast Fourier transform (FFT), or you can 
choose a general-purpose addressing device. Some 

continued, page 6-114
Array processor vs general-purpose computer



To understand better what an
array processor does, consider 
first the strengths and short- 
comings of general-purpose com- 
puters. General-purpose comput- 
ers incorporate the standard Von 
Neumann architecture and per- 
form a variety of tasks. Such 
computers perform instruction- 
fetch and instruction-execution 
tasks sequentially, with instruc- 
tions and data available in one 
memory array (Fig A). 

Consider the calculation of the 
sum of products, a common task 
in signal-processing and matrix- 
manipulation algorithms. The 
basic sum-of-products equation is 
Fig A—A general-purpose computer memory stores instructions and data in the same
block. The computer must access instruction and data values sequentially.



$Y = \sum_{i=1}^{n} k_i x_i$,

where $k_i$ and $x_i$ represent coeffi-
cients and data stored in memo-
ry, respectively. The sum-of- 
products computation represents 
a large class of array-processing 
problems that share three funda- 
mental characteristics: First, 
they involve repetitive computa- 
tions on arrays of data. Second, 
the underlying control structure 
is simple, having many loops but 
no conditional branches. Third, 
the math steps are memory-in- 
tensive — each calculation re- 
quires one data point and one 
constant from memory. 

To evaluate a product term, 
the computer fetches $x_i$ and $k_i$,
multiplies them, and then adds 
the result to the running total. 
Each step requires an instruc- 
tion-fetch cycle and an instruc- 



tion-execution cycle. Although 
specific details vary from com- 
puter to computer, in general 
even primitive math operations 
require many cycles. 
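Written as a program, the sum of products is a single loop with no conditional branches, which is why it suits an array processor. The sketch below is a plain software version; the function name and the sample values are hypothetical.

```python
def sum_of_products(k, x):
    """Accumulate k[i] * x[i] over two equal-length sequences."""
    if len(k) != len(x):
        raise ValueError("coefficient and data arrays must have the same length")
    total = 0.0
    for ki, xi in zip(k, x):   # fetch one coefficient and one data point
        total += ki * xi       # multiply, then add to the running total
    return total

print(sum_of_products([0.5, 0.25, 0.25], [2.0, 4.0, 8.0]))
```

Each pass through the loop needs two memory operands, which is the memory-intensive pattern described above.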

Overlapping operation 

Traditionally, Von Neumann-
type computers perform each 
step sequentially. Array pro- 
cessors, however, provide a de- 
gree of parallelism by doing 
more than one thing at a time. 
When data and program steps
reside in separate memories (an
arrangement that fits the Har-
vard-architecture model), in-
struction- and data-fetch opera-
tions can overlap (Fig B). In the
case of the sum-of-products op- 
eration, the array processor 
fetches the input operands at the 
same time that it fetches the in- 
struction that performs the mul- 
tiplication. Most array proces- 
sors also overlap instruction- 
fetch and instruction-execution 
operations. 

For highly regular, math-in- 
tensive algorithms, the overlap- 
ping results in high-speed opera- 
tion, but such operation can be 
inefficient when the algorithm
includes conditional branches. If, 



for example, a program calls for
a conditional branch to another 
instruction, the instruction fol- 
lowing the branch instruction 
may be in the instruction queue. 
If it is in the queue, the comput- 
er discards it. Array processors 
are therefore best suited to the 
many number-crunching algo- 
rithms that require little or no 
conditional branching. 

Because array processors pro- 
vide parallel operation, you can 
optimize them for a specific 
math process. For example, an 
array processor designed for a 
sum-of-products operation may 
contain a multiplier and adder 
circuit, which evaluates a prod- 
uct term in one cycle. Because 
array processors perform paral- 
lel operations, programming the 
processors is more demanding 
than programming a general- 
purpose computer. However, the 
resulting increase in computa- 
tional power often justifies the 
additional programming effort. 
Instead of programming in Basic 
or in assembly language, you'll 
use a microcode that controls in- 
dividual circuits and operations 
in the array processor. Although 
such programming is demanding,
it gives you complete con-
trol of the array processor's in- 
ternal operations. 

Five functional blocks 

Array processors typically re- 
ceive data and instructions from
a host machine, usually a
general-purpose computer. Al-
though specific array-processor 
architectures vary greatly, most 
processors contain at least five 
functional blocks: an arithmetic 
unit, data memory, a controller, 
program memory, and a host 
interface. 

The heart of the processor is 
the arithmetic unit, which con-
trols the data paths and per- 
forms arithmetic operations. De- 
pending on your application, the 
arithmetic unit performs fixed- 
point operations, floating-point 
operations, or both. For some 
high-speed, real-time applica- 
tions, such as radar- and video- 
information processing, array 
processors operate on 12-, 16-, 
or 24-bit fixed-point data. How- 
ever, the trend is toward 32-bit 



floating-point data processing. 
The data memory (usually
banks of high-speed RAM or
PROM) supplies operands to the
arithmetic unit and stores re- 
sults from the arithmetic unit. 
The data memory can have mul- 
tiple data ports, depending on 
how fast the memory chips must 
supply operands and accept re- 
sults. If it doesn't have enough 
ports or enough speed, the data 
memory can become a process- 
ing bottleneck, leaving the arith- 
metic unit starved for operands. 

Controller is simple 

The controller sequences the 
array processor through its op- 
erations. Because most array- 
processing algorithms have mod- 
est sequencing requirements, 
the controller isn't complex. 
Controllers provide a program 
counter (PC) that you increment 
to access the next program- 
memory word. You can also load 
the PC with the program memo- 
ry's output to force the control- 
ler to jump to a different part of 



Fig B—An array processor's memory provides separate storage blocks for instruc-
tions and data. The separate storage areas let the control circuits access instructions
and data in parallel.



the program. The controller in- 
cludes a loop counter, which 
counts repeated operations. De- 
pending on the array processor's 
sophistication, the controller 
may incorporate circuits that 
control nested subroutines, in- 
terrupts, and conditional-branch 
operations. 

The program memory stores 
the array processor's microcode, 
which controls the other pro- 
cessor elements. Like the data 
memory, the program memory 
can be RAM or PROM. Use 
PROMs when the algorithms are 
well-defined and unlikely to 
change. Use RAM during algo- 
rithm development. The re- 
sources in the array processor 
determine the microcode memo- 
ry's bit width. For example, a 
60-bit-wide program memory 
provides 30 bits that control the 
arithmetic unit, 15 bits that 
transfer information to the con- 
troller (including a 12-bit jump 
address), and 15 bits that control 
other internal array-processor 
resources. 
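
The 60-bit example above can be pictured as a packed bit field. The following is only a sketch: the field order, the field names, and the assumption that the 12-bit jump address occupies the low bits of the 15-bit controller field are illustrative choices, not taken from the article.

```python
# Hypothetical 60-bit microword layout following the example split in the text:
# 30 bits for the arithmetic unit, 15 bits for the controller (12 of them a
# jump address), and 15 bits for other resources. Field order is an assumption.
ALU_BITS, CTRL_BITS, MISC_BITS = 30, 15, 15
JUMP_BITS = 12


def pack_microword(alu, seq_op, jump_addr, misc):
    """Pack the four fields into one 60-bit integer."""
    assert 0 <= alu < (1 << ALU_BITS)
    assert 0 <= seq_op < (1 << (CTRL_BITS - JUMP_BITS))
    assert 0 <= jump_addr < (1 << JUMP_BITS)
    assert 0 <= misc < (1 << MISC_BITS)
    ctrl = (seq_op << JUMP_BITS) | jump_addr
    return (alu << (CTRL_BITS + MISC_BITS)) | (ctrl << MISC_BITS) | misc


def unpack_microword(word):
    """Split a 60-bit microword back into (alu, seq_op, jump_addr, misc)."""
    misc = word & ((1 << MISC_BITS) - 1)
    ctrl = (word >> MISC_BITS) & ((1 << CTRL_BITS) - 1)
    alu = word >> (CTRL_BITS + MISC_BITS)
    return alu, ctrl >> JUMP_BITS, ctrl & ((1 << JUMP_BITS) - 1), misc
```

Adding a resource to the processor widens one of these fields, which is why the hardware resources determine the program memory's bit width.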

The host interface transfers 
data and instructions between 
the host computer and the array 
processor, usually by DMA op- 
erations. The host computer 
sends the array processor a 
block of data and an instruction 
word that selects a processing 
algorithm. After processing the 
data, the array processor trans- 
fers the results to the host 
computer. 

An array processor can include pipeline reg- 
isters that let the circuit overlap tasks. 

array processors provide both a general-purpose and a 
dedicated address-generator circuit. You'll find sepa- 
rate address generators for data and coefficient memo- 
ries in array processors that provide extremely high 
processing speeds. 

An address generator reduces the size of your array 
processor's program memory, and it increases the 
processor's speed. To increase processing speed fur- 
ther, consider adding arithmetic hardware to your 
design so the processor can do several computations in 
parallel. In the basic array-processor design, the arith- 
metic unit performs one operation at a time — for exam- 
ple, sums of products, which involve alternate addition 
and multiplication operations. The array processor per- 
forms the multiplication and addition operations se- 
quentially. 

The throughput of the basic array processor is 250 
nsec per floating-point product term; to increase that 



speed you can gang two 29325 floating-point math 
processors (Fig 3). The processors communicate 
through a 6-port RAM. When the circuit incorporates a 
multiport RAM, the floating-point processors can each 
access two input operands and store one result during 
each clock cycle. Because data produced by one float- 
ing-point processor is accessible to the other, you can 
double the processing speed for such algorithms as 
sum-of-products: One processor produces product 
terms, while the other processor sums and accumulates 
them. Of course, you can choose other math-chip config- 
urations that better suit specific array-processing 
tasks. Keep in mind, however, that although you gain 
higher-speed operations by providing parallel math 
chips, your programming tasks grow. Coordinating the 
software operations of several parallel math chips can 
be difficult. 

Memory expansion increases throughput 

When you upgrade the arithmetic unit by adding 
parallel math chips, you must improve the data memory 
as well. The data-memory configuration in the basic 
array processor limits processing speed because the 
processor only accesses one constant and only performs 
one RAM-read or -write operation per clock cycle. To 
let the array processor perform operations that require 
two operands from RAM in the same cycle, or that 
require RAM-read and -write operations during the 
same cycle, you must upgrade the memory. Possible 
enhancements include converting the coefficient PROM 
to high-speed RAM, running the data RAM at twice 
the processor's speed to allow single-cycle reading and 
writing, or replacing the data RAM with a 2-port 
RAM. 

In addition to high processing speeds, some applica- 
tions may require rapid data transfers between the 
array processor and the host computer. There are at 
least two ways of speeding the transfer of data from the 
host to the array processor. First, you can replace the 
array processor's data RAM with a 2-section memory 
(Fig 4) that gives the host computer access to one 
section while the array processor uses the other. When 
the array processor completes its task, it switches 
between the buffers. The host obtains the results from 
the array processor's old buffer, while the processor 
operates with the data in the host's old buffer. The host 
computer's and the array processor's operations are no 
longer sequential; instead, they overlap. You'll have to 
pay careful attention to the manner in which the array 
processor controls the 2-section memory, because you 
don't want to switch buffers while the host or the array 
processor is still using one. 
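
The buffer-switching rule can be sketched in software. This is a toy model, not the hardware described in the article: the class name, the `host_busy` and `ap_busy` flags, and the exception are all assumptions made for illustration.

```python
class TwoSectionMemory:
    """Toy model of a 2-section (ping-pong) data memory."""

    def __init__(self, size):
        self.sections = [[0.0] * size, [0.0] * size]
        self.host_side = 0      # index of the section the host currently owns
        self.host_busy = False  # set while the host is transferring data
        self.ap_busy = False    # set while the array processor is computing

    @property
    def host_section(self):
        return self.sections[self.host_side]

    @property
    def ap_section(self):
        return self.sections[1 - self.host_side]

    def switch(self):
        # Refuse to switch while either side is still using its section.
        if self.host_busy or self.ap_busy:
            raise RuntimeError("cannot switch sections while one is in use")
        self.host_side = 1 - self.host_side
```

After a switch, the data the host just loaded becomes the array processor's working set, and the previous results become visible to the host.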

A second approach involves bypassing the host com- 
puter and letting the array processor take data directly 
from the data source — for example, an A/D converter. 
The processor uses the data and passes results to the 
host computer. 

The 2-section-memory and direct-data-input tech- 
niques aren't mutually exclusive. In a given application, 
you might send data from an A/D converter directly to 
a 2-section memory. In this case, when the A/D con- 
verter's memory is full, it switches the memory section 
to the array processor. 

Dividing the work load 

By adding both direct-data input and output ports to 
your array-processor design, you can connect several 
processors in series, letting each one perform a subset 
of your algorithm. After it processes a piece or block of 
information, each processor passes results to the next 
processor in the chain. 



The basic array processor performs addition, sub- 
traction, multiplication, and format-conversion opera- 
tions. For complex and transcendental operations, 
you'll need specific microcode routines that offer cosine, 
sine, and other functions. Standard algorithms are 
available, so your programming tasks aren't insur- 
mountable. Part 3 of EDN's floating-point series will 
explore transcendental functions and tell how to imple- 
ment them. 

Floating-point µP implements high-speed math functions 



This final article in a 3-part series describes how to 
incorporate a floating-point processor into your sys- 
tem. It discusses criteria for the selection of the algo- 
rithms you'll use, and in particular it details the 
methods used to implement transcendental functions. 



David Quong, Advanced Micro Devices 

If your application must perform a variety of math 
functions at high speeds on a wide range of input data, 
consider designing a math subsystem based upon a 
VLSI floating-point processor. A floating-point pro- 
cessor, a microsequencer, RAM, and ROM, configured 
as shown in Fig 1, together with the appropriate 
algorithms, will allow you to perform most math func- 
tions at real-time speeds with high precision and a very 
large dynamic range. A system of this type will outper- 
form even the fastest floating-point coprocessor. 

The choice of algorithms is an important step in the 
realization of your math processor. You can choose from 
a variety of methods for implementing transcendental 
and other math functions: the Taylor series, the 
Chebyshev series expansion, and the Newton-Raphson 
approximation are just a few of the many possible 
approaches. Which algorithm is the best one for your 
particular application will depend upon what functions 
you want to perform, the hardware architecture you are 



using, and the system throughput and accuracy you 
expect to receive. 

Many designers select the Taylor series for perform- 
ing math functions. This well-known method allows you 
to find equations for various functions in most books of 
math tables. The Taylor series has a major drawback, 
however: It has a nonuniform convergence rate in the 
number of terms needed to achieve a desired accuracy. 
Consider, for example, the Taylor series expansion of 
the sine function: 



$$\sin(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots$$



For values of x near zero radians, this equation 
converges very quickly, but as x becomes larger, you'll 
need a larger number of terms to evaluate sin(x) to the 
same accuracy that you obtained for the smaller values. 
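
The nonuniform convergence is easy to observe numerically. The sketch below counts how many Taylor terms are needed to reach a given absolute accuracy; the function name and the tolerance of 1e-7 are assumptions chosen for illustration, and the library sine serves only as the reference value.

```python
import math


def taylor_sin_terms(x, tol=1e-7, max_terms=50):
    """Return (approximation, terms used) for the Taylor series of sin(x)."""
    total = 0.0
    term = x  # first term of the series: x / 1!
    for n in range(max_terms):
        total += term
        if abs(total - math.sin(x)) <= tol:
            return total, n + 1
        # Next term: multiply by -x^2 / ((2n + 2)(2n + 3)).
        term *= -x * x / ((2 * n + 2) * (2 * n + 3))
    return total, max_terms


_, terms_small = taylor_sin_terms(0.1)
_, terms_large = taylor_sin_terms(3.0)
print(terms_small, terms_large)
```

An argument near zero needs only a couple of terms, while an argument near 3 radians needs several times as many for the same tolerance.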

The Chebyshev expansion method, like the Taylor 
method, produces a polynomial approximation, but it's 
not so well known. The generation of the Chebyshev 
approximation for a particular function is more complex 
than for the Taylor series, but the resulting polynomial 
is just as easy to implement. The major advantage of 
the Chebyshev method is that it has uniform conver- 
gence. Moreover, for any given function, over the 
operating range of the Chebyshev series this method 
yields smaller errors than almost any other method. 
You can usually determine by inspection the upper 
bound of the error; the error of the truncated series 

A math-processing subsystem incorporating 
a VLSI floating-point processor will outper- 
form even the fastest available floating- 
point coprocessor. 

cannot exceed the sum of the absolute values of the 
remaining Chebyshev coefficients. (For details of the 
derivation of the Chebyshev series, see box, "Deriving 
a Chebyshev series.") 

Iteration handles simple functions 

For some simple functions such as division and 
square-root extraction, the Newton-Raphson method, 
an iterative approach for approximating such functions, 
works well. When using this or any other iterative 
method, you have to start with a seed, or initial 
approximation. The better this approximation is, the 
faster will be the convergence. You can store predeter- 
mined seed values in a look-up table. This method 
usually requires extra hardware (in the form of ROMs), 
but it gives you flexibility, because you can store seed 
values that are as accurate as you want. 

The chief attraction of the Newton-Raphson method 
is its rapid convergence; the number of iterations 
required is low. The method converges quadratically; 
ie, the error is roughly squared by each iteration. 
For example, if the seed is accurate to eight bits, the 
first iteration improves the accuracy to 16 bits, and the 
second iteration improves it to approximately 32 bits 
(variance depends on the magnitude of the error). 

The math processor shown in Fig 1 evaluates 
Chebyshev and Newton-Raphson approximations very 
efficiently. The system performs transcendental (trigo- 
nometric, logarithmic, and exponential) functions by 
the Chebyshev method and division and square-root 
extraction by the Newton-Raphson method. 

Understand the algorithms 

The algorithms for 10 very common math functions 
are described below. You'll need these functions for 
applications associated with navigation, guidance, 
image processing, signal processing, and many other 
areas. The algorithms for the transcendental functions 
are based on the Chebyshev method and consist of a 
3-stage process. The first stage reduces the range of 
the input arguments to values between +1 and -1, 
because the Chebyshev expansion operates only over 
this range. The second stage evaluates the polynomial 
derived from the Chebyshev expansion. The third 
stage performs any postprocessing that may be re- 
quired, such as correction of the sign. 

Deriving a Chebyshev series 

The Chebyshev series expansion is a procedure for generating a 
polynomial approximation for a given math function, f(x). To 
expand the function, you must express it as a Chebyshev 
series: 

$$f(x) = 0.5C_0 + C_1T_1(x) + C_2T_2(x) + \cdots$$

[...] Chebyshev polynomial of degree n given by [...] 

[...] upon the number of terms you use. (If you are interested in a 
formal derivation of the Chebyshev method, see Refs 1 
and 2.) 

Expansion for sine function 

[...] Chebyshev expansion for the sine function, first go to the 
coefficient tables in Ref 2 and look up the coefficients for the 
sine function (or calculate them from the formula given above). 

Substituting the $T_n(x)$ polynomials into the Chebyshev 
series gives 

$$\sin(\tfrac{1}{4}\pi x) = 0.5C_0 + C_1x + C_2(2x^2 - 1) + C_3(4x^3 - 3x)$$ [...] 

$$\sin(\tfrac{1}{4}\pi x) = a_0 + a_1x + a_2x^2$$ [...] 

The Chebyshev expansion method, like the 
Taylor method, produces a polynomial ap- 
proximation, but it's not so well known. 

The detailed descriptions were developed by 
Clenshaw, Miller, and Woodger (Ref 1). They use the 
terms RND and CSERIES: RND indicates that the 
result of the operation must be rounded towards minus 
infinity, and CSERIES indicates that the Chebyshev 
series for the input must be evaluated. 

Range reduction prepares arguments 

The range-reduction steps for the sine function are 

• $x = x(2/\pi)$ 

• $x = x - (4(\mathrm{RND}(0.25(x + 1))))$ 

• If $x > 1$ then $x = 2 - x$. 

As noted, these steps reduce the input argument to the 
range $-1 \le x \le 1$. You then evaluate the sine function by 
summing the terms of the following polynomial equa- 
tion derived for the sine function: 

$\sin(x) = x(\mathrm{CSERIES}_{\sin}(2x^2 - 1))$. 
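
The three range-reduction steps can be checked numerically. In this sketch RND is taken as rounding toward minus infinity, as the article defines it, and the Chebyshev evaluation is replaced by a library call on the reduced argument, so only the reduction itself is exercised. The function names are assumptions.

```python
import math


def reduce_sine_argument(theta):
    """Apply the three sine range-reduction steps to an angle in radians."""
    x = theta * (2.0 / math.pi)                 # scale so one quadrant is 1.0
    x = x - 4.0 * math.floor(0.25 * (x + 1.0))  # fold into [-1, 3)
    if x > 1.0:
        x = 2.0 - x                             # fold into [-1, 1]
    return x


def sine_via_reduction(theta):
    x = reduce_sine_argument(theta)
    # Stand-in for x * CSERIES_sin(2x^2 - 1): the reduced x is measured in
    # quadrants, so the sine of the original angle is sin(pi * x / 2).
    return math.sin(0.5 * math.pi * x)
```

For any input angle the reduced argument lies between -1 and +1, and the sine of the reduced angle equals the sine of the original one.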

The range-reduction steps for the cosine function are 

• $x = x(2/\pi)$ 

• $x = 4(\mathrm{RND}(0.25(x + 2))) - x + 1$ 

• If $x > 1$ then $x = 2 - x$. 

You then evaluate the cosine function by using the same 
polynomial equation as for the sine function: 

$\cos(x) = x(\mathrm{CSERIES}_{\sin}(2x^2 - 1))$. 
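
The cosine reduction maps the angle so that the sine polynomial can be reused. A minimal sketch follows, again with RND as rounding toward minus infinity and a library sine standing in for the Chebyshev evaluation; the function names are assumptions.

```python
import math


def reduce_cosine_argument(theta):
    """Apply the cosine range-reduction steps to an angle in radians."""
    x = theta * (2.0 / math.pi)
    x = 4.0 * math.floor(0.25 * (x + 2.0)) - x + 1.0  # folds into (-1, 3]
    if x > 1.0:
        x = 2.0 - x
    return x


def cosine_via_reduction(theta):
    x = reduce_cosine_argument(theta)
    # The reduced x feeds the *sine* polynomial; sin(pi * x / 2) stands in
    # for x * CSERIES_sin(2x^2 - 1).
    return math.sin(0.5 * math.pi * x)
```

The second step both shifts the angle by a quarter period and removes whole periods, which is why no separate cosine series is required.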

The range-reduction steps for the tangent function 
are 

• $x = x(2/\pi)$ 

• $x = x - (4(\mathrm{RND}(0.25(x + 1))))$ 

• $y = x$ 

• If $x > 1$ then $x = 2 - x$. 

The Chebyshev polynomial evaluation for the tangent 
function is 

$\tan(x) = x(\mathrm{CSERIES}_{\tan}(2x^2 - 1))$. 

You have to perform one postprocessing step: 

If $y > 1$ then $\tan(x) = 1/\tan(x)$. 

You don't need any range-reduction steps for the 



arcsine function, because all values outside the range 
$-1 \le x \le 1$ indicate an error condition. For input argu- 
ments in the range $x^2 \le \tfrac{1}{2}$, you evaluate the arcsine as 
follows: 

$\mathrm{asin}(x) = x(\sqrt{2}(\mathrm{CSERIES}_{\mathrm{asin}}(4x^2 - 1)))$. 

For input arguments in the range $\tfrac{1}{2} < x^2 \le 1$, you evalu- 
ate the arcsine as follows: 

$\mathrm{asin}(x) = \mathrm{sign}(x)((\pi/2) - \sqrt{2 - 2x^2}(\mathrm{CSERIES}_{\mathrm{asin}}(3 - 4x^2)))$, 

where sign(x) is the sign of x. 

You use the following trigonometric identity to evalu- 
ate the arc-cosine function: 

$\mathrm{acos}(x) = \pi/2 - \mathrm{asin}(x)$. 

The range-reduction steps for the arctangent func- 
tion are 

• u=x 

• If ABS(x)>l then x=l/x, 

where ABS(x) is the absolute value of x. The 
Chebyshev polynomial evaluation is 

$\mathrm{atan}(x) = x(\mathrm{CSERIES}_{\mathrm{atan}}(2x^2 - 1))$. 

The postprocessing steps are 

If $u > 1$ then $\mathrm{atan}(x) = +(\pi/2) - \mathrm{atan}(x)$ 

and 

If $u < -1$ then $\mathrm{atan}(x) = -(\pi/2) - \mathrm{atan}(x)$. 
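
The arctangent reduction and postprocessing can be sketched as follows. The `series` parameter is a hypothetical stand-in for the Chebyshev evaluation on the range -1 to +1; here it defaults to the library arctangent so that only the reduction logic is tested.

```python
import math


def atan_via_reduction(x, series=math.atan):
    """Arctangent using reciprocal range reduction and sign-aware fix-up."""
    u = x                       # remember the original argument
    if abs(x) > 1.0:
        x = 1.0 / x             # reduced argument now lies in [-1, 1]
    result = series(x)          # stand-in for x * CSERIES_atan(2x^2 - 1)
    if u > 1.0:
        result = math.pi / 2 - result
    elif u < -1.0:
        result = -math.pi / 2 - result
    return result
```

The reciprocal step is the reason the math subsystem needs a fast division routine even for functions that are otherwise pure polynomial evaluations.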



The range-reduction steps for the exponentiation 
function are 

• $x = x(\log_2 e)$ 

• $N = 1 + \mathrm{RND}(x)$. 

The Chebyshev polynomial evaluation is 

$\exp(x) = 2^N(\mathrm{CSERIES}_{\exp}(2(N - x) - 1))$. 
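
The exponential reduction can be sketched in the same way. One inference is made here that the article does not state: for the final formula to hold, the series evaluated at t = 2(N - x) - 1 must approximate 2 raised to the power -(t + 1)/2, so that expression is used as the stand-in. RND is again rounding toward minus infinity.

```python
import math


def exp_via_reduction(x):
    """Compute e**x by scaling to base 2 and splitting off an integer power."""
    x = x * math.log2(math.e)       # e**x_original == 2**x after this scaling
    n = 1 + math.floor(x)           # integer part, so 0 < n - x <= 1
    t = 2.0 * (n - x) - 1.0         # series argument, nominally -1 < t <= 1
    series = 2.0 ** (-(t + 1.0) / 2.0)  # inferred stand-in for CSERIES_exp(t)
    return math.ldexp(series, n)    # multiply by 2**n via an exponent adjust
```

Multiplying by a power of two is only an exponent adjustment in floating point, which is why the reduction is cheap.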

Only positive values are valid input arguments for 
the natural-log function; a zero or a negative value 
should be flagged as an error: 

$\ln(x) = (\mathrm{CSERIES}_{\ln}(4(\mathrm{mant}(x)) - 3)) + (\mathrm{expo}(x) - 1)(\ln(2))$, 

where mant(x) is the mantissa value of x, expo(x) is the 
exponent value of x, and ln(2) is a constant value. 
You perform division operations by evaluating the 
reciprocal function. For example, you can express the 
division operation C=A/B in its reciprocal form, 
C=A(1/B). By using the Newton-Raphson method, you 
can find an iterative expression for the reciprocal 
function. This expression is 

$x_{i+1} = x_i(2 - Bx_i)$, 

where $x_0$ is the initial divisor reciprocal (seed value) for 
i=0, and $x_i$ is the ith approximation. 
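
The reciprocal iteration is short enough to try directly. This sketch uses ordinary double-precision arithmetic rather than the 32-bit format of the floating-point processor, and the function name and seed value are assumptions.

```python
def reciprocal_newton(b, seed, iterations):
    """Refine an approximation to 1/b; returns (result, list of iterates)."""
    x = seed
    history = [x]
    for _ in range(iterations):
        x = x * (2.0 - b * x)   # one multiply, one subtract, one multiply
        history.append(x)
    return x, history


recip, iterates = reciprocal_newton(3.0, 0.33, 3)
quotient = 10.0 * recip         # C = A(1/B)
for x in iterates:
    print(abs(1.0 - 3.0 * x))   # relative error of each iterate
```

The printed relative errors shrink roughly as the square of the previous value, which is the quadratic convergence described earlier.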

The square-root function also uses the Newton- 
Raphson method. The iterative expression for the 
inverse square-root function is 

$x_{i+1} = 0.5(x_i(3.0 - Ax_i^2))$. 

You then evaluate the square root of A by the equation 

$B = A(x_{i+1})$, 

where A is the input argument, B is the square root of 
A, $x_0$ is the initial approximation (seed value) for i=0, 
and $x_i$ is the ith approximation. 
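
A minimal sketch of the square-root procedure, again in double precision and with an assumed function name and seed:

```python
def sqrt_newton(a, seed, iterations):
    """Square root of a via Newton-Raphson on the inverse square root."""
    x = seed                                 # approximation to 1/sqrt(a)
    for _ in range(iterations):
        x = 0.5 * x * (3.0 - a * x * x)      # refine the inverse square root
    return a * x                             # sqrt(a) == a * (1/sqrt(a))


print(sqrt_newton(2.0, 0.7, 4))
```

Iterating on the inverse square root avoids any division inside the loop; the only operations are multiplications and one subtraction.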

The principal component of the math-processor sub- 
system described here is the Am29325 floating-point 
processor. The subsystem also contains RAM, bipolar 
PROMs to store coefficients, a subsystem controller, 
and a host interface. The floating-point processor per- 
forms all computations under control of the subsystem 
controller; microcoded programs to perform the func- 
tions you need reside in the subsystem controller's 
PROM. If you wish to modify existing functions or add 
new functions, you merely change the microprogram- 
med PROM. 

The Am29325 floating-point processor (Fig 2) pro- 
vides many features that simplify subsystem design. 
The 3-port, 32-bit I/O structure of the Am29325 avoids 
data multiplexing and allows efficient transfer of infor- 
mation. The 32-bit internal registers and data paths 
allow the chip to store the results of intermediate 
calculations for use in subsequent operations, thereby 
avoiding the delays that transfer of these results to and 
from off-chip storage would entail. Many functions don't 
need to send data out of the chip until the final results of 
an operation are ready. 

The floating-point-processor hardware detects excep- 
tional conditions and, rather than compounding the 
error until the end of the calculation, immediately 
notifies the host system. The chip notifies the host by 
means of flags that indicate underflow, overflow, inva- 
lid operation, and other error conditions. 

Subsystem data storage consists of a high-speed, 
4-port RAM. You can load the data memory from the 
host computer (using DMA), from the floating-point 
processor, or from an integer processor. You'll need to 
process integers during operations such as isolating the 
exponent and mantissa portions of a floating-point 
word. You can have the host processor perform integer 
processing, or you can arrange it so that the math 
subsystem performs the required operations by incor- 
porating an integer processor chip in your design. 

Learn to microprogram the processor 

Two examples of how to implement math functions on 
the Am29325 floating-point processor will give you an 
introduction to the microcoding procedures you'll use in 
the math processor. Recall that, for a given division 
operation (C=A/B), the Newton-Raphson division algo- 
rithm begins by obtaining the reciprocal of the divisor 
by means of an iterative equation. A single iteration 
requires just three arithmetic operations: 

• multiplication: $Bx_i = u$ 

• subtraction: $2 - u = v$ 

• multiplication: $vx_i = x_{i+1}$. 

You can microcode this procedure with a 3-instruction 
loop that you repeat until you obtain a sufficiently 
accurate value of $x_{i+1}$. You then perform a single multi- 
plication, $A \times x_{i+1}$, to obtain the quotient. 

The math processor uses the Newton- 
Raphson method to execute the division 
and square-root functions. 

The conventional way to obtain a seed is to use the 
most significant 16 or so bits of the divisor as a pointer 
into a look-up table in ROM; the contents of the address 
to which the divisor bits point become the seed output, 
which usually has approximately the same number of 
bits. You might think that use of a 16-bit address would 
require a ROM that's 64k words deep, but this is not so. 
In floating-point division, you can reciprocate the expo- 
nent and significand separately, each from its own 
table, and then recombine them. Consequently, for an 
8-bit exponent and the eight most significant bits of the 
significand, you require only two tables, each just 256 
words deep. 
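
The split-table idea can be modeled in software. This sketch is not the ROM layout from the article: it uses the standard library's `frexp` and `ldexp` to separate and recombine the exponent and significand, models the exponent table as a simple negation, and fills a 256-entry significand table with midpoint reciprocals. All names are assumptions.

```python
import math

# 256-entry significand table: entry i approximates 1/m for the slice of
# m in [0.5, 1) selected by the eight bits that follow the leading 1.
SIG_TABLE = [1.0 / (0.5 + (i + 0.5) / 512.0) for i in range(256)]


def reciprocal_seed(b):
    """Return a seed for 1/b that is good to roughly eight bits (b != 0)."""
    m, e = math.frexp(b)                   # b == m * 2**e, 0.5 <= |m| < 1
    idx = int((abs(m) - 0.5) * 512.0)      # table index, 0..255
    sig = SIG_TABLE[idx]
    # "Exponent table": reciprocating 2**e only negates the exponent.
    return math.copysign(math.ldexp(sig, -e), b)
```

Because the exponent and significand are looked up independently, the storage grows with the sum of the two table sizes rather than with their product.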

You can also trade ROM word width for execution 
time (ie, the number of iterations); doubling the width 
of the significand stored in ROM will reduce reciprocal 
refinement time by roughly one iteration. Convergence 
is specified by the inequality $2/B > |x_0| > 0$. 

The microcoding for the complete Newton-Raphson 
division is shown in Table 1. The operation requires six 
lines of microcode. In cycle 1, you load the seed into 
register R of the floating-point processor and load the 
divisor into register S. In cycle 2, you multiply the 
contents of registers R and S; the result appears in 
register F. 

In cycle 3, you perform the subtraction, using the 
2-S instruction of the floating-point processor. The 



input for port S comes from register F via the internal 
feedback path. The result of the subtraction appears in 
register F. 

In cycle 4, you perform the second multiplication. 
This operation multiplies the contents of register F (via 
port S) by $x_i$ (from register R). The result, $x_{i+1}$, replaces 
$x_i$ in register R. In parallel with the multiplication, the 
microsequencer executes a jump back to cycle 2 to 
begin the next iteration. 

Cycle 5 begins after the last iteration of cycles 2 
through 4. In this cycle, you load the dividend (A) into 
register S and multiply it by the contents of register R 
to produce the final result. This result appears in 
register F, from which you can unload it via the F bus 
to local data storage or to the host. 

The second implementation example uses the 
Chebyshev method to perform a sine calculation. In the 
polynomial equation that evaluates the sine function, 

$\mathrm{CSERIES}_{\sin} = a_0 + a_1x + a_2x^2 + a_3x^3 + a_4x^4 + a_5x^5$. 

The range-reduction steps require eight or nine oper- 
ations. Evaluation of the polynomial equation requires 
23 additional operations, including processing of the 
$2x^2 - 1$ expression. One final operation multiplies the 
result of the polynomial evaluation by x. The sine 
function therefore requires 32 or 33 operations. 

The floating-point processor hardware de- 
tects exceptional conditions and, rather 
than compounding the error, immediately 
notifies the host system. 

You can, however, save 10 cycles in the evaluation of 
the polynomial equation by applying Horner's Rule, an 
algebraic method for rearranging components in a 
polynomial. The polynomial equation then becomes 

$\mathrm{CSERIES}_{\sin} = ((((a_5x + a_4)x + a_3)x + a_2)x + a_1)x + a_0$. 

The total number of operations in the sine function then 
decreases to 22 or 23. Evaluation of the rearranged 
polynomial equation is complete in 10 clock cycles. 
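
Horner's Rule maps directly onto a multiply-then-add loop. The sketch below counts the operations for a six-coefficient polynomial; the coefficient values in the usage line are placeholders, not the Chebyshev-derived coefficients for the sine function.

```python
def horner(coeffs, x):
    """Evaluate a0 + a1*x + ... + an*x**n; coeffs = [a0, a1, ..., an].

    Returns (value, multiplies, adds).
    """
    acc = coeffs[-1]                 # start with the highest-order coefficient
    mults = adds = 0
    for a in reversed(coeffs[:-1]):
        acc = acc * x + a            # one multiply and one add per coefficient
        mults += 1
        adds += 1
    return acc, mults, adds


print(horner([1, 2, 3, 4, 5, 6], 2))   # placeholder coefficients
```

For a fifth-degree polynomial the loop performs five multiplications and five additions, matching the 10 clock cycles quoted above when each operation takes one cycle.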

In cycle 1, you load x into the S register and $a_5$ into 
the R register. Multiply these two operands to produce 
$a_5 \times x$. In cycle 2, you load the result of the multiplication 
into the F register, load $a_4$ into the R register, and add 
the contents of the F and R registers to yield 

$(a_5 \times x) + a_4$. 

In cycle 3, you load the result of the addition into the 
R register; the S register still contains x. Perform R×S 
to obtain 

$((a_5 \times x) + a_4)x$. 

Cycles 4 through 10 perform similar addition and 
multiplication operations, progressively using the 
terms $a_3$ through $a_0$. The final result of evaluating the 
polynomial equation is available in the F register after 
cycle 10. 

The ability to perform both simple and complex math 
functions rapidly is critical in systems that process data 
in real time. You won't yet find many simple, compact 
solutions to this problem on the market. Math-coproc- 
essor ICs are available, but they are still in the low- to 
medium-performance range, and they limit you to a 
microprocessor environment. (Table 2 shows compara- 



tive timings for two floating-point coprocessor chips 
and the Am29325 floating-point processor.) 

You can design and build your own MSI chip, but such 
a product will require much development time and cost, 
and it will probably be large and consume lots of power. 
Another possible approach is to compute the values of 
the math functions you will need and to store these 
values in ROM, but such a look-up-table method is 
adequate only for small amounts of data. At the present 
time, the use of a math subsystem based upon a VLSI 
floating-point processor with a relatively small amount 
of support circuitry appears to be the most cost- 
effective solution. EDN 

TABLE 2 [...] OF SINGLE-PRECISION FLOATING-POINT FUNCTIONS 

FLOATING-POINT      SPEED   ADD     MULTIPLY  DIVISION  SQUARE ROOT  SINE    COSINE  TANGENT 
CHIP                (MHz)   (µSEC)  (µSEC)    (µSEC)    (µSEC)       (µSEC)  (µSEC)  (µSEC) 

INTEL 8087 (1)      8.0     12.5    18.1      25.4      23.3         NOTES   NOTES   676 
MOTOROLA 68881 (2)  16.67   2.6     3.1       3.8       N/A          23.0    23.0    272 
AMD Am29325         8.0     0.125   0.125     1.125     1.625        2.875   3.125   4.750 

NOTES: 

N/A = TIMES NOT AVAILABLE. 

1. TIMES FOR THE INTEL 8087 WERE DERIVED FROM THE INSTRUCTION CLOCK COUNT GIVEN IN THE INTEL DATA PAMPHLET (1984). ALL 
TIMES LISTED ARE WORST CASE. 

2. TIMES FOR THE MOTOROLA MC68881 WERE TAKEN FROM A NEWS ITEM IN ELECTRONIC PRODUCTS, FEBRUARY 15, 1985, PG 43. 

Optimize your graphics system for 2-D and 3-D 


The design of a graphics system that's both 
2-dimensional and 3-dimensional poses 
some conflicting requirements. You can rec- 
oncile some of these conflicts, however, 
through careful design of the frame-buffer 
structure, and you can achieve adequate 
speed for 3-D applications by using parallel 
processors for computation-intensive tasks. 



Anoop S Khurana and Olivier Garbe, 
Advanced Micro Devices Inc 

A graphics system that will handle both 2- and 3- 
dimensional applications presents design requirements 
that are at odds with one another. These conflicts arise 
from the fundamental differences in the nature of the 
geometry-, pixel-, and display-processing tasks re- 
quired by the two systems. A system with a micropro- 
grammed architecture can help you avoid the difficul- 
ties you'd encounter in reconciling these differences. 

You'd use a 2-D graphics system with such graphics 
editors as MacDraw, MacPaint, and Interleaf, or with 
CAE programs such as schematic-capture packages or 
layout editors for pc-board design. You'd need a 3-D 
system, on the other hand, to display 3-D wire-frame 

Reprinted with permission from EDN, Vol. 32, No. 6, March 18, 1987. Copyright 
1987, Reed Publishing USA. 



models, to model solids for mechanical design, or to 
produce visually pleasing 3-D pictures for animation. 

One of the major differences lies in the size of the 
frame buffer needed, and the speed with which the host 
computer can obtain access to it. Most 2-D systems 
need only eight bits to define a pixel color as one of 256 
simultaneously displayable colors. A 3-D system, on the 
other hand, needs eight bits each for red (R), green (G), 
and blue (B) — a total of 24 bits per pixel. Also, 2-D 
pixel-processing operations require fast access to multi- 
ple pixels during the same frame-buffer cycle. In a 3-D 
system, by contrast, pixel-processing operations (such 
as Gouraud shading) are computation-intensive but 
require access to only one pixel at a time. 

Similarly, geometry-processing operations are more 
arithmetic-intensive in 3-D than in 2-D systems. Fixed- 
point, 32-bit arithmetic provides adequate computa- 
tional power and speed for many 2-D applications, 
whereas 3-D applications need the speed and versatility 
of fast floating-point arithmetic. 

Most of the graphics systems available today, includ- 
ing engineering workstations, are optimized for 2-D 
graphics operations; if they have 3-D capabilities, they 
perform the required processing mainly in software, 
which is slow. To obtain adequate speed, then, serious 
users of 3-D graphics find that they need a separate 
system that's optimized for 3-D graphics, resulting in 
an expensive duplication of hardware and software. 

You can avoid these disadvantages by designing a 
single graphics system that provides all the features 

A 2-dimensional graphics system can han- 
dle diagrams, but you need 3-dimensional 
capability for mechanical modeling. 



necessary for both 2-D and 3-D graphics. You'll find a 
microprogrammed architecture ideal for such a system, 
because such an architecture lets you customize the 
data paths and computational resources to a particular 
application and to the performance level that you want. 
It also lets you integrate both fast integer and fast 
floating-point arithmetic capabilities, both of which are 
necessary for complex graphics operations, into a single 
system. 

As an example of such a system, consider the design 
of a graphics peripheral for a conventional minicomput- 
er. This peripheral can act as a bus master on the host's 
system bus, but it need not do so. The application 
program runs on the host computer and generates a 
display list, defining the image, which the CPU passes 
to the graphics peripheral via a DMA channel (or by 
any other appropriate means). The graphics peripheral 
processes this display list to generate the image. (The 



steps that convert a display list to an image on the 
screen are collectively referred to as the "graphics 
pipeline"; see box, "From object to image: the graphics 
pipeline.") The three main functional blocks of the 
system are the communications and display-list han- 
dler; an update processor that performs geometry and 
pixel processing; and a display controller (Fig 1). 

A conventional, general-purpose, 16- or 32-bit µP, 
which has its own memory and DMA channel, receives 
and executes commands issued by the host. This com- 
munications processor can directly execute some host 
commands, such as Load Display-List. Other com- 
mands, such as Render Display-List, involve the rest of 
the graphics system; the communications processor 
analyzes these commands and dispatches appropriate 
commands to the update processor, using a message- 
based protocol and a fast, dual-access memory block 
that serves as a mailbox. 



From object to image: the graphics pipeline 



The graphics pipeline is the se- 
quence of operations that trans- 
lates the user's description of a 
scene into a viewable image. The 
four stages in this process are 
display-list handling, geometry 
processing, pixel processing, and 
display control. 



The display-list handler helps 
the user or the application pro- 
gram decompose objects to be 
depicted into a display list. The 
display list is usually hierarchi- 
cal, and it embodies the struc- 
ture inherent in the object being 
modeled. Leaf nodes in the hier- 



archy are drawing primitives 
provided by the graphics 
system. 

The geometry processor per- 
forms viewing- and perspective- 
transformation operations on the 
display list, and it clips objects 
against the boundaries of the 

Fig 1. A graphics subsystem is ideally an intelligent peripheral that accepts a display list from the host computer and converts the digital 
representation of an image into a standard video signal that creates a screen display. 



The dual ports of the mailbox allow the update 
processor to read a command while the communications 
processor is sending a subsequent command. Sema- 
phores, also located in the mailbox RAM, govern both 
command chaining and the allocation of memory to 
message buffers. 

The microprogrammed update processor executes all 



commands that are related to geometry or pixel pro- 
cessing. Such operations may update the pixel data in 
the frame buffer, or they may pass a message back to 
the communications processor. 

The frame buffer uses video RAM (VRAM) ICs, both 
to maximize bandwidth and to minimize the quantity of 
hardware needed for refreshing the image. The frame- 



viewing volume. You can decom- 
pose the complex primitives used 
by the geometry processor, such 
as patches or cubic curves, into 
simpler primitives, such as poly- 
gons or lines. 

The pixel processor physically 
writes all the pixels affected by 
a primitive into their correct lo- 
cations in the frame buffer. It 
also performs all operations, 
such as pixel-block transfers, 
that require pixels to be read 
from or written to the frame 
buffer. 

The display controller con- 
verts the pixel values stored in 
the frame buffer into a standard 
video signal. This video signal, 
when transmitted to a suitable 
monitor, builds the desired 
image on the screen. 

A single, general-purpose pro- 
cessor, such as the Intel 80286, 
along with the 80287 numeric co- 
processor, can perform all the 
operations in the graphics pipe- 



line sequentially. In such a sys- 
tem, the main processor writes 
the final value of each pixel to 
the frame buffer, which forms 
part of the address space of the 
main processor. This configura- 
tion is relatively slow, however, 
and the speed may be inade- 
quate for 3-D applications. 

You can achieve improved per- 
formance by using specialized 
VLSI peripheral devices, such 
as the Am95C60 Quad Pixel Da- 
taflow Manager, to speed some 
of the operations in the graphics 
pipeline. Most current graphics 
peripherals relieve the main 
processor of most of the pixel- 
processing tasks. Typical func- 
tions performed by such periph- 
erals are line drawing, polygon 
filling, and block transfer of pix- 
els. Because these tasks are rel- 
atively standard and are well 
suited to implementation in 
high-performance silicon, graph- 
ics peripherals yield a substan- 



tial improvement in system per- 
formance. You can achieve a 
similar improvement by using 
high-performance floating-point 
processors to speed the compu- 
tation-intensive geometry-proc- 
essing tasks. 

For even higher performance 
and functionality, you should 
consider the use of multiproces- 
sing systems that provide one or 
more processors for each stage 
in the graphics pipeline. Two 
factors contribute to the im- 
provement in performance that 
such systems yield. First, be- 
cause most graphics operations 
are vector operations, the con- 
current performance of several 
parts of a task can yield a speed 
increase that's proportional to 
the number of processors avail- 
able. Second, you can fine-tune 
the system by customizing it for 
highest performance in just 
those operations that the applications
require.

A microprogrammed architecture lets you
customize the resources of the system to the 
problem you're trying to solve. 



buffer controller provides all the signals needed for 
reading, writing, and refreshing the VRAMs, and for 
performing all video-refresh functions. 

You'll need to organize the structure of the frame 
buffer carefully to make the most efficient use of the 
available storage. As noted, for 2-D displays you need 
only eight bits per pixel, which allows you to display the 
pixel in one of 256 colors. For 3-D displays, you need at 
least 24 bits per pixel (eight each for the R, G, and B
channels); you may also need, for each pixel, an addi- 
tional eight bits for the alpha channel and 16 or 32 bits 
for the Z buffer (a maximum of 64 bits/pixel). 

You can reduce the total number of bits per pixel by 
mapping the Z buffer into a portion of the frame buffer. 
For example, in a 2k-pixel x 1k-line buffer, you could
map a 1k x 1k-pixel screen into the first 1k pixels of each
line and the Z buffer into the second 1k pixels. Consequently,
you could access the Z value of a pixel by
adding an offset of 1024 to the pixel address. You would 
need two memory cycles to access both the RGB and the 
Z values of the pixel. This structure, however, has the 
great advantage that no bits are irrevocably dedicated 
to the Z buffer. If you don't need a Z buffer, this 
memory becomes available for general use. 
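
The Z-buffer mapping just described amounts to simple address arithmetic. The following Python fragment is a minimal sketch of it; the linear word-address layout (Y * 2048 + X) and the function names are assumptions made for illustration, not details given in the article.

```python
LINE_WIDTH = 2048   # pixels per line in the 2k x 1k buffer
Z_OFFSET = 1024     # Z values occupy the second 1k pixels of each line


def pixel_address(x, y):
    """Word address of the RGB value for screen pixel (x, y).

    Assumes a linear layout of one word per pixel, line after line.
    """
    if not (0 <= x < 1024 and 0 <= y < 1024):
        raise ValueError("pixel lies outside the 1k x 1k screen")
    return y * LINE_WIDTH + x


def z_address(x, y):
    """Word address of the Z value for the same pixel (same line, offset 1024)."""
    return pixel_address(x, y) + Z_OFFSET
```

Reading both the color and the depth of a pixel therefore takes two addresses, `pixel_address(x, y)` and `z_address(x, y)`, which matches the two memory cycles mentioned in the text.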

You'll still have to resolve the discrepancy between 
the eight bits/pixel needed for 2-D and the 24 bits/pixel 
needed for 3-D. Your first thought might be to allocate a 
32-bit memory word for each pixel, but then you'd be
wasting 24 bits in 2-D operations. A better solution is to 
allow each 32-bit word to be treated as four adjacent 
8-bit pixels in 2-D. You could then reorganize a 
2k x 1k x 32-bit memory as a frame buffer of 8k x 1k x 8
bits. This organization allows you to store one 3-D 
screen with a resolution of 1024 pixels x 1024 lines x 32 
planes, or several 2-D screens at once. 
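
To make the dual use of a 32-bit word concrete, here is a small Python sketch that packs four 8-bit pixels into one word and locates a 2-D pixel within a line. The byte order (first pixel in the most significant byte) and the function names are assumptions; the article does not specify them.

```python
def pack_pixels(p0, p1, p2, p3):
    """Pack four 8-bit pixels into one 32-bit word (p0 in the top byte, by assumption)."""
    for p in (p0, p1, p2, p3):
        if not 0 <= p <= 0xFF:
            raise ValueError("pixel value must fit in eight bits")
    return (p0 << 24) | (p1 << 16) | (p2 << 8) | p3


def unpack_pixels(word):
    """Split a 32-bit word back into its four 8-bit pixels."""
    return [(word >> shift) & 0xFF for shift in (24, 16, 8, 0)]


def locate_2d_pixel(x):
    """Map a 2-D X coordinate (0..8191) to (word index in the line, byte lane)."""
    if not 0 <= x < 8192:
        raise ValueError("X coordinate lies outside the 8k-pixel line")
    return x // 4, x % 4
```

With this layout a 2k-word line holds 8k 8-bit pixels, which is where the 8k x 1k x 8 figure comes from.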

The frame buffer in our example consists of 64k x 4- 
bit VRAMs and uses the shifter port of each VRAM for 
video refreshing; the update processor therefore has 
virtually unlimited access to the frame buffer. It's 
possible to organize each VRAM as a 256x256x4-bit 
square area of memory; using this area as a building 
block, you can create a 2k x 1k x 4-bit memory array
having four rows and eight columns (Fig 2). If you want 
to extend the depth of the array to 32 bits/pixel, you'll 
need eight VRAMs in each element (called a bank) of 
the array. 

The video display controller (VDC) provides com- 
plete control of the frame buffer, both for update 
operations and for video-refresh operations. In re- 
sponse to a read or write memory-cycle request from 



the update processor, the VDC generates the appropriate
VRAM-control signals (RAS, CAS, etc). If a dynamic-RAM
refresh cycle or a transfer cycle for video
refresh is already in progress, however, the VDC 
delays execution of the update cycle until the higher- 
priority cycle is finished. 

Because each access to the frame buffer reads or 
writes a 32-bit word, the 2k x 1k x 32-bit frame buffer
requires 21 address lines, of which 11 define the X 
address and the other 10 define the Y address within 
the array. In the 3-D 32-bit/pixel mode, each 32-bit 
word in the frame buffer represents one pixel.

In the 2-D 8-bit/pixel mode, each 32-bit word represents
four pixels. The 18 most significant address bits
select the 8-bit row address, the 8-bit column address, 
and RAS strobe signals. Decoding the three least 
significant bits yields a decode signal that selects one of 
eight adjacent pixels. 

The capacitive loading imposed by the VRAMs makes 
it necessary to buffer the address and control outputs of 
the display controller. To reduce skew between signals, 
and thereby achieve a shorter memory-cycle time, you
can buffer the address, RAS, CAS, and XF/G signals 
within a single IC package, such as the Am2976 11-bit
dynamic memory driver used in this example. 

Select one of eight pixels 

Each of the four rows in the frame memory receives
a separate RAS signal. You can therefore connect to a 
common 32-bit bus the data ports of all four banks of 
VRAMs within a column. Each memory cycle now gives 
access to eight pixels, one from each column. The 
update processor operates on only 32 bits at a time, 
however, so you'll need a mechanism to select just one of 
the eight available words. 

You can perform this 8:1 multiplexing quite simply by 
decoding the three least significant address bits to
obtain the CAS signal. As a result, only one bank in
memory receives both RAS and CAS. Consequently, 
you can tie together the outputs of all 32 banks in 
memory, but only the selected bank will drive the bus. 
To access eight sequential pixels, then, you'd need eight 
memory cycles. 
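
The address decoding can be sketched as a split of the 21-bit frame-buffer address into bank-select and in-chip fields. The exact bit assignment below is an assumption chosen to be consistent with the array of four rows and eight columns of 256 x 256 VRAM banks: the three low X bits pick the bank column (CAS decode), the two high Y bits pick the bank row (RAS strobe), and the remaining eight bits of each coordinate address a cell inside the VRAM.

```python
def decode_frame_address(x, y):
    """Split an (X, Y) word address into bank-select and in-chip fields.

    Assumed bit assignment (not stated explicitly in the article):
      X[2:0]  -> bank column, decoded to one of eight CAS signals
      X[10:3] -> 8-bit column address inside the VRAM
      Y[9:8]  -> bank row, decoded to one of four RAS strobes
      Y[7:0]  -> 8-bit row address inside the VRAM
    """
    if not (0 <= x < 2048 and 0 <= y < 1024):
        raise ValueError("address lies outside the 2k x 1k frame buffer")
    return {
        "bank_column": x & 0x7,
        "vram_column": x >> 3,
        "bank_row": y >> 8,
        "vram_row": y & 0xFF,
    }
```

Under this assignment, eight pixels that are adjacent in X fall into eight different bank columns, so one memory cycle can reach all of them.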

There's another way to perform the multiplexing, 
however — one that gives the update processor very 
rapid random access to any or all of the eight adjacent 
pixels addressed in a single memory cycle. This method 
requires eight 32-bit, bidirectional, bus-interface regis- 
ters. You connect the eight 32-bit words, accessed in 
parallel from the memory, independently to one port of

Fig 2. The frame buffer is organized as 2k pixels x 1k lines x 32 bits. Three-dimensional applications can read or write eight adjacent pixels
at one time. For 2-D applications, each 32-bit word represents four 8-bit pixels.

A microprogrammed graphics system acts
as a peripheral on the host computer's sys- 
tem bus. 



these registers. To the other port you tie corresponding 
bits of each register together to form a single 32-bit bus 
that leads to the update processor. You then perform 
the 8:1 multiplexing by controlling the output-enable 
signals of the registers. 

The update processor regards the registers as inde- 
pendent 8-pixel input and output buffers. A memory- 
read operation fills the input buffer, and the update 
processor can fetch any or all of the eight pixels much 
more quickly than if a separate memory cycle were 
required for each one. You can also provide two differ- 
ent write modes. In the first mode, the update pro- 
cessor writes just one pixel to the appropriate place in 
memory. In the second mode, the update processor fills 
all eight registers, and the memory cycle writes their 
contents to eight different pixels simultaneously. 

Refreshing the video display is easy when the display 



memory consists of VRAMs. At every vertical-sync 
(Vsync) pulse, the display controller resets an internal 
video-refresh counter to the address of the upper-left 
corner of the screen. At every horizontal-sync (Hsync) 
pulse, the controller initiates a transfer cycle that 
transfers data for the next scan line into the VRAMs' 
shift registers and then increments its internal address 
counter to point to the start of the data for the next 
line. You can perform panning and scrolling simply by 
changing the address held in the controller's top-of- 
frame register. 

Given that there are eight memory banks per row,
and that each VRAM is capable of shifting at a clock 
speed of 25 MHz, a total bandwidth of 200M pixels/sec 
is possible in 3-D mode. In 2-D mode, the available 
bandwidth becomes 800M pixels/sec. The maximum 
pixel bandwidth is therefore limited mainly by the
characteristics of the shift registers and the associated
D/A converter, not by those of the memory. 
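
The bandwidth figures follow directly from the bank count and the shift clock. A short Python check of the arithmetic (the constant and function names are illustrative only):

```python
SHIFT_CLOCK_HZ = 25_000_000   # VRAM serial-port shift rate quoted in the text
BANKS_PER_ROW = 8             # eight banks deliver eight words in parallel


def video_bandwidth(pixels_per_word):
    """Pixels per second available from the VRAM shift registers."""
    return BANKS_PER_ROW * SHIFT_CLOCK_HZ * pixels_per_word


bandwidth_3d = video_bandwidth(1)   # one 32-bit pixel per word
bandwidth_2d = video_bandwidth(4)   # four 8-bit pixels per word
```

The 2-D figure is four times the 3-D figure simply because each 32-bit word carries four 8-bit pixels.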

In 32-bit/pixel mode, strobe signals generated by the 
video clock generator — in this example, an Am8158 — 
load into the video shift registers the eight sequential 
32-bit pixels that are in parallel on the video bus (Fig 
3). The video shift registers consist of 16 dual, 8-bit, 
parallel-in, serial-out ECL shift-register ICs. These 
ICs produce serial bit streams of the R, G, and B values 
of each pixel and forward these bit streams to a triple 
8-bit D/A converter. 

In 8-bit/pixel mode, the 32 bits that appear at the R,
G, and B outputs of the shift registers actually repre- 
sent four pixels. Four 4-bit ECL shift registers convert 
the 32-bit data into four 8-bit pixels for use by the 
Am8151 ECL color palette. To change from one mode to 
the other, you need only make the appropriate modifi- 
cations to the Shift and Load signals to the shift 
registers.

The Am8158 generates the pixel clock pulse and some
of the Shift and Load signals used by the shift regis- 
ters. This IC also generates the Vsync, Hsync, and 
Blank pulses. The display controller uses these signals 
to initiate VRAM transfer cycles, and the D/A convert- 
ers use them to force the video signals to the appropri- 
ate sync or blank levels. You can program all the 
important parameters of these signals using registers 
contained in the Am8158. 

The update processor is microprogrammed 

The update processor performs all pixel- and geometry-processing
functions for both 2-D and 3-D graphics.
These functions require powerful and versatile data- 
transfer capability coupled with fast integer and float- 
ing-point arithmetic. Implementing the update pro- 
cessor as a microprogrammed subsystem allows you to 
achieve the high performance that you need. 

The major functional blocks and buses of the update 
processor are shown in Fig 4. The main data path in this 
example consists of the Am29332 integer ALU, the 
Am29323 integer multiplier, and the vector floating- 
point arithmetic unit, which consists of two Am29325 
ICs. Each of these units accepts data from two common 
32-bit input buses and places its results on one common 
32-bit output bus (the main data bus). 

An Am29334 register file provides storage for fre- 
quently accessed data. Its read ports supply data to the 
arithmetic unit's input buses. It also has two write 
ports, one of which accepts data from the main data 
bus, while the other transfers the result of an ALU
operation back to the register file without using the
main data bus. The system timing is such that the ALU 
can fetch two operands from the register file, process 
them, and write the result back to the register file 
within a single microcycle. 

The update processor addresses 64k 32-bit words of 
high-speed local data memory, which consists of static 
RAM. An Am2131 dual-port message-buffer IC occupies
1k words of the 64k-word address space. To allow
the main ALU to process video data at maximum 
efficiency, an auxiliary Am29C101 16-bit ALU performs 
all local-memory address computation; the outputs of 
this ALU are captured in a 16-bit address register. 
Random accesses to local memory therefore take two 
microcycles — one to compute and latch the address, and 
another to access the RAM. During consecutive memo- 
ry accesses, however, next-word computation overlaps 
the current RAM access, so that the second and subse- 
quent memory accesses are completed in a single micro- 
cycle. 

The frame-buffer-address generator consists of pre- 
settable up/down counters (an 11-bit counter for the X 
address and a 10-bit counter for the Y address). The 
sequencer loads these counters via the main data bus. 
Although the main ALU is primarily responsible for 
generating frame-buffer addresses, use of the counters 
speeds the critical loops in curve drawing and other 
pixel-processing functions. 

The update processor is configured with a single level 
of pipelining, so that next-address computation over- 
laps execution of the current microinstruction. The 
Am29331 sequencer computes the address of the next 
instruction in response to its instruction inputs, and it 
places the result on its Y output bus. For access to 
sequential microcode addresses, this result is simply 
the contents of the program counter. The sequencer 
uses an internal stack to store count values for nested 
loops and return addresses for calls to microcode sub- 
routines. 

To execute a jump to an address defined by the 
microcode, the sequencer connects the address section 
of the microinstruction word back into its program 
counter via the A bus. To allow the computation of jump 
addresses at run time, and to allow external examina- 
tion of the sequencer's stack and stack pointer, the D 
bus connects to the main system bus. 

An internal condition-code multiplexer, controlled by 
microcode, selects and enables one of the condition 
inputs of the sequencer; the sequencer can then test 
that condition and jump according to the state of the

The organization of the frame buffer is the
key to resolving conflicts between 2-D and 
3-D requirements. 



selected input. For testing as many as four conditions 
simultaneously, a PAL device accepts all the signals 
that need to be tested simultaneously and encodes them 
into four fields of four bits each. A base address is 
assigned to each field, and the state of the field defines 
one of 16 sequential locations as an offset from the base 
address. The sequencer can then examine one of these 
fields and jump to the location defined by the state of 
that field. You can use this capability to advantage in a 
line-clipping algorithm. 
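
As an illustration of how a 4-bit condition field can drive a 16-way branch in line clipping, the sketch below computes a region code for an endpoint and adds it to a base address. The region-code scheme is the familiar Cohen-Sutherland style, swapped in here as an example; the article does not name the clipping algorithm, and the bit assignment and function names are assumptions.

```python
LEFT, RIGHT, BELOW, ABOVE = 1, 2, 4, 8   # assumed bit assignment


def outcode(x, y, xmin, ymin, xmax, ymax):
    """Encode four boundary tests as one 4-bit field, as the PAL device might."""
    code = 0
    if x < xmin:
        code |= LEFT
    if x > xmax:
        code |= RIGHT
    if y < ymin:
        code |= BELOW
    if y > ymax:
        code |= ABOVE
    return code


def branch_target(base_address, field):
    """Microcode address selected by a 4-bit field: one of 16 slots after the base."""
    if not 0 <= field <= 0xF:
        raise ValueError("field must be four bits wide")
    return base_address + field
```

A single test-and-jump then dispatches to the handler for the endpoint's region instead of testing the four conditions one after another.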

In the 2-D mode, one of the most important pixel- 
processing operations is the movement of a rectangular 
block of pixels from one area of the frame buffer to 
another. This process, also known as BitBlt, may also 
require the execution of a logical operation during the 
transfer. The update processor transfers data one row 
at a time from the source block to the destination block. 



Within a row, the processor may transfer data either 
left to right or right to left. The sole reason for 
including the feature that provides fast access to eight 
pixels in the frame buffer is to speed block transfer. In 
the 32-bit/pixel mode, the algorithm that transfers one 
row of the source block to the corresponding row in the 
destination block has four steps, as illustrated in Fig 5a 
and described as follows:

• Read memory with X=24. This operation trans- 
fers pixels 24 through 31 into the frame buffer's read 
registers. Next, read pixels 30 and 31 into the register
file. Then read memory again with X=32. Read five
pixels (32 through 36) into the register file. You have
now transferred the first seven pixels from the source 
region into the register file (there are only seven valid 
pixels in the first destination read cycle). 

• Read memory with X=96. This operation transfers
seven valid destination pixels into the frame buffer's
registers.

• Read each valid destination pixel, one at a time, 
and perform any required logical operation with the 
corresponding source pixel in the register file. Write 
the resulting pixel back into the frame buffer's write 
registers. Copy each unread destination pixel from the 
input register to the output register. 

• Write the eight destination pixels in the output 
registers back to memory. Repeat the sequence until 
you have transferred the entire row. 

Assuming that a memory-read cycle takes 300 nsec 
and that each frame-buffer read or write operation 
takes 100 nsec, the total transfer time is 500 nsec/pixel. 
Using this algorithm, an average covering all possible 
alignments of source and destination turns out to be 
approximately 600 nsec/pixel. This time is a substantial 
improvement over the time of 1200 nsec/pixel for the 
case in which each memory cycle accesses a single pixel, 
and it's an acceptable data-transfer speed for 32-bit 
pixels. 

In the 8-bit/pixel mode, the block-transfer algorithm 
must take into account different alignments of the 
source and destination within a 32-bit word, and it 
requires a modification of the procedure. The modified 



algorithm, illustrated in Fig 5b, is as follows: 

• Read source words 1 and 2 simultaneously from 
both output ports of the register file. Using the 
Am29332 funnel shifter, extract four bytes aligned with 
the destination, and write this 32-bit word back to a 
temporary location in the register file. In the example 
shown, you need to extract the last three pixels of word
1 and pixel S2 from word 2. 

• Read this aligned source location, using one regis- 
ter-file port. Read the destination pixel from the frame 
buffer via the main bus into the second register-file 
port. 

• Perform the logical operation on the aligned- 
source and destination pixels, using the mask generated 
internally by the ALU; doing so leaves the first pixel 
unchanged by the logical operation. Write the result, 
which appears at the ALU's outputs, back to the frame 
buffer's input registers at the end of the cycle. 
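
The alignment and masking steps above can be sketched in Python as follows. This is a behavioral illustration only, not Am29332 microcode: the byte order (first pixel in the most significant byte), the mask convention (1 bits mark pixels to be modified), and the function names are all assumptions.

```python
MASK32 = 0xFFFFFFFF


def funnel_extract(word1, word2, byte_offset):
    """Return the 32 bits that start byte_offset bytes into the pair word1:word2."""
    if not 0 <= byte_offset <= 4:
        raise ValueError("byte offset must be between 0 and 4")
    combined = ((word1 & MASK32) << 32) | (word2 & MASK32)
    return (combined >> (32 - 8 * byte_offset)) & MASK32


def masked_rop(source, dest, mask, op):
    """Apply op(source, dest) where mask bits are 1; keep dest elsewhere."""
    result = op(source, dest) & MASK32
    return (result & mask) | (dest & ~mask & MASK32)


# Last three pixels of word 1 plus the first pixel of word 2.
aligned = funnel_extract(0xAABBCCDD, 0x11223344, 1)
# XOR into the destination, leaving its first pixel unchanged.
updated = masked_rop(aligned, 0x01020304, 0x00FFFFFF, lambda s, d: s ^ d)
```

The mask is what lets a row start or end in the middle of a word without disturbing the neighboring pixels.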

Step 3 of the algorithm now takes three microcycles
per word instead of two, and it changes the average
transfer time to just over 600 nsec per word. Because
each word contains four pixels, the average pixel-transfer
time is 600 ÷ 4 = 150 nsec/pixel. This pixel-transfer
rate allows an entire 1k x 1k-pixel screen to be
updated in 150 msec, or about 10 frame times, and is

The update processor needs fast access to
several pixels at a time in the frame 
buffer. 



sufficient for displaying text and manipulating 
windows. 

It's not difficult to implement line- and circle-drawing 
algorithms, such as those of Bresenham, in microcode. 
The inner loop of Bresenham's line-drawing algorithm 
will require three microcycles. Because this time is 
equal to the time needed to access a pixel in the frame 
buffer, you can plot pixels at the pixel-access speed of 
the memory. However, because this algorithm does not 
profit from the fast access to sequential pixels, the 
plotting speed will be about the same in both the 
32-bit/pixel and the 8-bit/pixel modes. The inner loop of 
Bresenham's circle-drawing algorithm will require four 
microcycles, and because each iteration through the 
loop generates eight points that must be plotted in 
separate memory cycles, circles too are drawn at the 



rate of about one pixel in every frame-buffer access 
time. 
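
For reference, the inner loop of Bresenham's line algorithm looks like this in Python. This is the common all-octant integer formulation, given as a sketch of what the microcode loop computes; it says nothing about how the three microcycles are actually allocated.

```python
def bresenham_line(x0, y0, x1, y1):
    """Return the pixels on the line from (x0, y0) to (x1, y1), endpoints included."""
    points = []
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    step_x = 1 if x0 < x1 else -1
    step_y = 1 if y0 < y1 else -1
    error = dx + dy
    while True:
        points.append((x0, y0))          # one frame-buffer write per pixel
        if x0 == x1 and y0 == y1:
            break
        doubled = 2 * error
        if doubled >= dy:                # step in X
            error += dy
            x0 += step_x
        if doubled <= dx:                # step in Y
            error += dx
            y0 += step_y
    return points
```

Each pass through the loop uses only additions, comparisons, and a conditional step, which is why it maps onto a few microinstructions and why successive pixels generally land in different words rather than in one 8-pixel group.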

Typical pixel- and geometry-processing operations in 
a 3-D system are computation-intensive and require 
that you carefully consider the design of the arithmetic 
unit. Integer arithmetic, although fast, is unsuitable for 
these graphics operations. Fixed-point arithmetic has 
disadvantages as well. Although you can readily per- 
form most pixel-processing functions using 32-bit fixed- 
point arithmetic, fixed-point geometry-processing op- 
erations require time-consuming pre- and postscaling 
operations. For this reason, floating-point operations 
are easier to develop and are more general in character. 
Furthermore, there are now many inexpensive floating- 
point chips, which are almost as fast as integer units
and provide all the computation power you need.

In a graphics system, most of the arithmetic computations
are vector operations, because points, plane
equations, transformation matrices, and other common
data structures are all vectors. For example, you can 
represent a point in 3-D space, in homogeneous form, as 
the vector (x y z w). Although a single processor can 
perform vector operations sequentially, a multiple- 
processor system that uses four ICs (in this example, 
Am29325s) is much faster. If you can distribute the 
computation tasks among the four processors in such a 
way that you keep each processor busy all of the time, 
you can expect to achieve four times the performance of 
a single processor. 

Fortunately, it's quite easy to distribute the simple 
vector operations that are useful in graphics. For example,
perspective division on a point (x y z w) in homogeneous
coordinates yields (x/w y/w z/w 1). Consequently,
you can perform these divisions in parallel on four 
different processors, and you can arrange for algo- 
rithms that do not map onto such an architecture to run 
(though more slowly) on a single processor as a se- 
quence of scalar operations. Furthermore, the fact that 
all processors perform the same operation (division, in 
this example) at the same time (but on different data) 
suggests that you should design the floating-point unit 
as a single-instruction, multiple-data (SIMD) machine, 
whose processors share a common instruction bus. 

You can see the overall structure of a 4-processor 
SIMD floating-point unit in Fig 6. Each section consists 
of a floating-point processor, a register file, and a seed 
ROM (Fig 7). In each section, a 64-word area of the 
stack constitutes the register file, and you can address 
data in the register file with a 6-bit negative displace- 
ment from the stack pointer. The microcode word 
therefore contains four 6-bit fields to specify the ad- 
dresses of the four ports on the register file. The 
stack-addressing capability allows microcode subrou- 
tines to be completely general in character, and if you 
first load the stack pointer with zero, you can use the 
microcode-word displacement fields to specify absolute 
addresses. 

The seven instruction bits of the main microcode 
word, when decoded, provide all the output-enable and 
multiplexer-select signals needed to reflect all possible 
arithmetic-operation and source/destination combinations.
Twenty-four bits specify the addresses for the
four ports of the register file, two bits control write 
operations on the Da and Db ports of the register file, 
and one bit switches the source-select multiplexer located
at the register file's Da input. Two additional bits
determine whether the stack pointer is to be left
unchanged, incremented, decremented, or loaded from
the data bus.

TABLE 1: TRANSFORMATION

A data-access microcycle consists of three time slots. 
In the first slot, the address hardware computes regis- 
ter-file addresses by adding the displacement specified 
in the microcode word to the current contents of the 
stack pointer. In the second slot, data is written into 
the register file. In the last slot, data required for the 
next execution cycle is read from the register file. 

The pipelined structure of the floating-point unit 
allows the overlapping of arithmetic operations with 
operations that access data from the register file. As a 
rule, the floating-point unit must access data from the 
register file one microcycle before using that data in an 
arithmetic operation. In many cases, however, the data 
needed for the next operation is already held in the 
Am29325's internal registers, so that a register-access 
cycle is unnecessary. Furthermore, most graphics op- 
erations allow execution cycles to overlap data-access 
cycles in a similar manner. Consequently, the effective 
throughput of the floating-point unit remains close to 
one operation per microcycle. 

Guidelines for coding typical operations 

As an example of how you can distribute portions of 
an operation among the four processors, consider the 
transformation of a 3-D point in homogeneous coordinates,
using a 4x4 matrix. The first step is to broadcast
all four coordinates of the point to be transformed, and 
to write them into the register files of all four sections 
of the floating-point unit simultaneously. Because the 
register file also acts as the matrix stack, the transfor- 
mation matrix is already established in the floating- 
point unit. You then distribute the transformation 
matrix among the four sections, storing only one col- 
umn of the matrix in each section. 

Assume that the point to be transformed is on top of 
the stack at [ST(0) ST(1) ST(2) ST(3)], and that the 
matrix column is at [ST(4) ST(5) ST(6) ST(7)], where 
ST(n) refers to the data n words down from the current
stack pointer. You perform the transformation by com- 
puting the dot product of the point and a column of the 
transformation matrix. You can now compute, in parallel,
the four dot products needed to transform each

The update processor is configured with a
single level of pipelining, so that next-ad- 
dress computation overlaps execution of the 
current microinstruction. 



component of the vector, one in each section of the 
floating-point unit. The entire transformation can com- 
plete within nine microcycles (Table 1). 
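
The column-per-section scheme can be modeled in a few lines of Python. Each list element below stands for the work done by one floating-point section; the row-vector convention (point times matrix) follows from the text's use of matrix columns, while the function names are illustrative.

```python
def dot4(a, b):
    """Dot product of two 4-element vectors (four multiplies, three adds)."""
    return sum(p * q for p, q in zip(a, b))


def transform_point(point, matrix):
    """Transform a homogeneous row vector by a 4x4 matrix.

    Section i of the floating-point unit holds column i of the matrix
    and produces component i of the result.
    """
    columns = list(zip(*matrix))
    return [dot4(point, column) for column in columns]


def perspective_divide(point):
    """Map (x y z w) to (x/w y/w z/w 1)."""
    x, y, z, w = point
    return [x / w, y / w, z / w, 1.0]
```

In hardware the four dot products run at the same time, so the elapsed time is that of one dot product rather than four.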

You can use the same approach to perform matrix- 
matrix multiplication. In this case, assume that the 
current transformation is on top of the stack, with one
column in each section. You can now treat a row of the
new matrix as a point and transform it by the matrix
held on top of the stack to yield a row of the transformed
matrix. You repeat this procedure four times
(once for each row) to obtain the complete result. A 
matrix-matrix multiplication therefore takes 36 micro- 
cycles. 

You can also perform parallel interpolation, using 
forward differences, when drawing cubic curves such as 
splines and Bezier curves. In this case, each iteration 
requires three addition operations, and because each 
component of the vector requires an identical computation,
you can perform the four computations in parallel
in the four sections. Consequently, you can compute a
new point every four microcycles. In the computation
shown below, Dx, D2x, and D3x are the first-, second-,
and third-order forward differences for the X coordinate:

[X Dx D2x D3x] = [X Dx D2x D3x] + [Dx D2x D3x 0]

[Y Dy D2y D3y] = [Y Dy D2y D3y] + [Dy D2y D3y 0]

[Z Dz D2z D3z] = [Z Dz D2z D3z] + [Dz D2z D3z 0]
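
The update rule is easy to check numerically. The Python sketch below applies it to one coordinate of a cubic; the initial differences are derived from the polynomial coefficients in the standard way, and the function names are illustrative rather than taken from the article.

```python
def forward_difference_step(state):
    """One iteration: [X D1 D2 D3] + [D1 D2 D3 0], using the old values throughout."""
    x, d1, d2, d3 = state
    return (x + d1, d1 + d2, d2 + d3, d3)


def cubic_points(a, b, c, d, steps):
    """Evaluate f(t) = a*t^3 + b*t^2 + c*t + d at t = 0, h, 2h, ..., 1 (h = 1/steps)."""
    h = 1.0 / steps
    state = (
        d,                                # f(0)
        a * h**3 + b * h**2 + c * h,      # first forward difference at t = 0
        6 * a * h**3 + 2 * b * h**2,      # second forward difference
        6 * a * h**3,                     # third forward difference (constant)
    )
    points = [state[0]]
    for _ in range(steps):
        state = forward_difference_step(state)
        points.append(state[0])
    return points
```

Each new point costs three additions per coordinate and no multiplications, and the X, Y, and Z updates are independent, so they can proceed in separate sections at the same time.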

Perspective division requires a division operation, 
and the normalization of an interpolated vector, in the 
inner loop of Phong shading, requires square-root operations.
The Am29325 does not perform division and
square roots directly, however. Instead, it uses New- 
ton-Raphson iteration to obtain the corresponding re- 
sults. The seed ROM provides the seed (or first approxi- 
mation) to start the iteration procedure. Each iteration 
requires three microcycles for division and five micro- 
cycles for square roots. Refining the seed to approxi- 
mately single-precision accuracy requires another three 
microcycles. Consequently, each division operation re- 
quires a total of ten microcycles, and each square-root 
operation requires sixteen microcycles. Furthermore, 
because each processor in the floating-point unit has its 
own seed table, four such computations can proceed in 
parallel. 
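
The article does not spell out the recurrences, so the following Python sketch uses the standard Newton-Raphson forms for a reciprocal and a reciprocal square root, each built only from multiplies and subtracts. The seed values passed in stand in for the seed ROM and are arbitrary; the iteration counts are illustrative and are not meant to reproduce the microcycle totals quoted above.

```python
def newton_reciprocal(b, seed, iterations):
    """Approximate 1/b with x <- x * (2 - b * x), starting from a seed."""
    x = seed
    for _ in range(iterations):
        x = x * (2.0 - b * x)
    return x


def newton_divide(a, b, seed, iterations):
    """Approximate a / b as a times the refined reciprocal of b."""
    return a * newton_reciprocal(b, seed, iterations)


def newton_sqrt(b, seed, iterations):
    """Approximate sqrt(b) via the reciprocal square root y <- y * (3 - b*y*y) / 2."""
    y = seed
    for _ in range(iterations):
        y = y * (3.0 - b * y * y) * 0.5
    return b * y
```

Both recurrences roughly double the number of correct bits on every pass, which is why a modest seed table followed by a few iterations is enough for single precision.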

Variable-width FIFO buffer
sequences large data words

Tim Olson

Advanced Micro Devices Inc., 901 Thompson Pl., P.O. Box 3453, Sunnyvale, CA 94088; (408) 732-2400.



Fast systems gain from a cascadable device supporting everything from instruction pipelines to peripheral host adapters.



First-in, first-out (FIFO) buffers are a popular 
means of matching different data rates in large digi- 
tal systems. I/O controllers for character-oriented 
devices like terminals, for example, usually return 
or receive one 8-bit byte on a slow but regular basis. 
In contrast, block-oriented devices, such as high- 
speed disks, must move large chunks of data from 
peripherals to the host bus with great speed. 

The demand for larger, denser data-processing
systems has spurred the development of FIFO buffers
with deeper memory but unchanged width.
Cascading these buffers horizontally or vertically
is still the most common and cost-efficient
method of expanding both the width and
depth of a data queue.

Even this solution has shortcomings.
FIFO buffers usually 
link devices of like width but do not possess the req- 
uisite logic to cope with, say, transferring data be- 
tween a 32-bit-wide memory, 16- or 32-bit data bus- 
es, and an 8-bit peripheral bus. To further 
complicate matters, some of the newer variable- 
width instruction architectures must buffer in- 
struction words varying in width from 8 to 128 bits 
at any particular cycle. 

In short, as both synchronous and asynchronous 
systems push toward larger or disparate data 
widths, it becomes more difficult to cascade with 
typical 8- and 9-bit-wide FIFO buffers in a rudi- 
mentary fashion. Designers are seeking an efficient 
solution for matching data widths as well as data 
rates. 

One of the best devices for such matching is the 
Am29338 Byte Queue FIFO buffer. The general- 
purpose, 32-bit-wide buffer is organized as four 
dual-ported RAMs, each 9 bits (1 byte plus parity) 
wide and 32 bytes deep (Fig. 1a). Each RAM section
has its own queue (load) and dequeue (unload)

Reprinted with permission from Electronic Design,
Vol. 35, No. 14, July 11, 1987. Copyright 1987
Hayden Publishing Co., Inc.

pointers (Fig. 1b) and supplies byte-wise (that is,
byte-by-byte) parity checking at the buffer's input 
and output. A Byte Count output shows the current 
number of bytes in the queue. The RAMs are orga- 
nized so that a variable number of bytes can be 
queued or dequeued at any cycle. The device can 
queue or dequeue from zero to four 8-bit bytes of
data in one 80-ns cycle. Ultimately, this feature can 
be used to queue data at one width and dequeue it at 
another. For example, two 16-bit half words may be
queued sequentially and dequeued as one 32-bit 
word. In addition, the Am29338 can be cascaded 
horizontally to release up to 16 data bytes (128 bits)
per cycle. 
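
The width conversion described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the class name `ByteQueue` and its methods are hypothetical, and the model ignores clocks, pointers, and parity. It keeps only the two facts given in the text, namely a 128-byte capacity (four RAMs of 32 bytes) and a limit of four bytes per operation.

```python
from collections import deque


class ByteQueue:
    """Toy model of a byte-oriented FIFO (hypothetical, not the device logic)."""

    MAX_PER_CYCLE = 4  # zero to four bytes per queue or dequeue operation

    def __init__(self, depth=128):  # four RAMs x 32 bytes, per the text
        self.depth = depth
        self._bytes = deque()

    def queue(self, data):
        """Load up to four bytes; return False if they would not fit."""
        if len(data) > self.MAX_PER_CYCLE:
            raise ValueError("at most four bytes per cycle")
        if len(self._bytes) + len(data) > self.depth:
            return False
        self._bytes.extend(data)
        return True

    def dequeue(self, count):
        """Unload up to four bytes, oldest first."""
        if count > self.MAX_PER_CYCLE:
            raise ValueError("at most four bytes per cycle")
        if count > len(self._bytes):
            raise IndexError("not enough bytes queued")
        return bytes(self._bytes.popleft() for _ in range(count))

    @property
    def byte_count(self):
        """Number of bytes currently held (models the Byte Count output)."""
        return len(self._bytes)


q = ByteQueue()
q.queue(bytes([0x11, 0x22]))  # first 16-bit half word
q.queue(bytes([0x33, 0x44]))  # second 16-bit half word
word = q.dequeue(4)           # one 32-bit word comes out
```

Queuing two bytes at a time and dequeuing four at a time is the 16-bit to 32-bit case mentioned in the text; other width pairs work the same way.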

The Am29338 also addresses the problem of byte 
ordering, a side effect of the evolution of memory 
word widths from 8 to 16 to 32 bits. Byte ordering is
simply the order in which bytes appear in a word.
The Am29338 performs byte swapping to effect
any type of byte-ordering scheme. Two signals, for
example, allow bytes to be swapped within 16-bit
half words and within 32-bit words, respectively.
Together, they make possible four separate byte
orderings (Fig. 2).
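
A minimal sketch of how two independent swap controls yield four orderings follows. The exact behavior of the device's swap signals is not spelled out here, so this is an assumption: one control exchanges the two bytes inside each 16-bit half word, the other exchanges the two half words of the 32-bit word. The function and parameter names are hypothetical.

```python
def reorder(word, swap_within_halves=False, swap_halves=False):
    """Return a 32-bit word with its bytes reordered by two swap controls."""
    b3, b2, b1, b0 = [(word >> shift) & 0xFF for shift in (24, 16, 8, 0)]
    if swap_within_halves:
        # Exchange the two bytes inside each 16-bit half word.
        b3, b2, b1, b0 = b2, b3, b0, b1
    if swap_halves:
        # Exchange the upper and lower 16-bit half words.
        b3, b2, b1, b0 = b1, b0, b3, b2
    return (b3 << 24) | (b2 << 16) | (b1 << 8) | b0


orderings = [
    reorder(0x11223344, a, b)
    for a in (False, True)
    for b in (False, True)
]
```

With both controls active the byte order is fully reversed, which is the usual conversion between big-endian and little-endian 32-bit words.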

Like the rest of the Am29300 family of 32-bit mi- 
croprogrammable building blocks, the Am29338 is 
implemented in ECL (packaged in a 120-pin pin- 
grid-array) but is interfaced with TTL-level de- 
vices. Because it is RAM-based, the buffer has an 
almost zero fall-through delay, suiting it to applications
where data must be immediately available
after a queueing operation.

These features best suit systems with variable data
widths, especially instruction-prefetching pipe- 
lines, I/O peripheral buffers, and hardware 
mailboxes. 

AN INSTRUCTION-PREFETCH QUEUE

Instruction-prefetch queues, of course, separate 
instruction fetching from instruction execution for 
parallel execution of the two tasks. Between jumps 
from one operation to the other, a sequential in- 
struction stream is fetched from memory and 

Electronic Design • June 11, 1987






DESIGN APPLICATION ■ Variable-width FIFO buffer




[Figure 1 block diagram. Labeled blocks: Full and A-Full generate; bytes-in-queue logic; Empty and A-Empty generate; memory slice logic (four slices); dequeue rotate logic; parity check.]



1. The Am29338 Byte Queue from AMD is a general-purpose,
32-bit FIFO buffer with four 8-by-32-bit RAM
memory stacks. It works in either the synchronous or
asynchronous mode, can transmit data blocks, and
performs error checking at both input and output.
Up to four bytes can be queued or dequeued in
one cycle (a). Each stack has its own pointers:
queue and dequeue logic enabling variable-width
data to enter and leave the FIFO buffer (b).
placed in the prefetch queue. This occurs independently
of the rate at which the instructions are decoded and exe- 
cuted. Because many computer architectures work with 
variable-length instructions, the Am29338, which can re- 
lease data of different widths, greatly simplifies prefetch- 
queue designs. Fixed-width words can be queued from 
memory while variable-length instructions are dequeued. 
The Am29338 buffer can function as an instruction- 
prefetch queue, where it is synchronized with a separate 
instruction-fetch unit (Fig. 3). In operation, sequential 
32-bit memory locations are fetched by the instruction- 
fetch unit and are stacked in the byte queue. Each time 
the CPU needs an instruction, it takes the next bytes in 
the byte queue rather than addressing main memory. The 
CPU can determine the instruction length from the first 
byte of the instruction and updates the dequeue pointer in 
the byte queue; that is, it tells the byte queue which bytes 
it wants to see. The instruction length is determined by 
the 4-bit word on the Bytes Dequeued (BDQ) lines while 
the Dequeue Clock (DQCLK) line releases the bytes 
from the queue. If a jump in the instruction sequence (the 
program) occurs, the instruction-fetch unit must flush 
the byte queue by asserting the Reset line and issuing a 
new instruction address. 
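
The dequeue side of a prefetch queue can be sketched as follows. The instruction encoding is an assumption made purely for illustration: the low two bits of the first byte are taken to hold the instruction length minus one. Real instruction sets encode length differently, and the names `hypothetical_length` and `drain` are not from the source.

```python
def hypothetical_length(first_byte):
    """Assumed encoding: low two bits hold (length - 1), so 1 to 4 bytes."""
    return (first_byte & 0x03) + 1


def drain(queue_bytes):
    """Split prefetched bytes into whole instructions.

    Returns the decoded instructions and the number of bytes consumed,
    which plays the role of the updated dequeue pointer.
    """
    instructions = []
    pos = 0
    while pos < len(queue_bytes):
        n = hypothetical_length(queue_bytes[pos])
        if pos + n > len(queue_bytes):
            break  # the rest of this instruction has not been fetched yet
        instructions.append(bytes(queue_bytes[pos:pos + n]))
        pos += n
    return instructions, pos


prefetched = bytes([0x00, 0x01, 0xAA, 0x03, 0x10, 0x20, 0x30, 0x02, 0x55])
decoded, consumed = drain(prefetched)
```

Each pass through the loop corresponds to one dequeue operation in which the byte count requested equals the decoded instruction length.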

EXECUTING SMALL LOOPS 

The Byte Count (CNT) indicator can serve as a tool to
limit the buffer's depth. For instance, jump or branch instructions
usually account for about 20% of a typical instruction
mix. When a jump occurs, instructions stored in
the instruction-prefetch queue are discarded. To limit
instruction-prefetching operations and conserve memory
bandwidth, the user can sound an alarm when the fetch
buffer's depth surpasses five or six instructions.

Many operations, however, can be executed with small
loops, which fit entirely in the prefetch queue and can be
controlled with the assertion of the retransmit line
(RXMIT) and with a small amount of external hardware.
The Am29338 buffer can rapidly retransmit stored block
data without requeuing from main memory, assuming
that 128 bytes or less have been queued since the last
assertion of a Reset command. This is done by first bringing
the RXMIT line low. When this happens, the chip's internal
dequeue pointers are directed to the first RAM location,
and the internal queue pointers are not reset. The
data in the locations between the old queue pointers and
the new dequeue pointers can then be unloaded. RXMIT
is useful for redundant instruction sequences because the
CPU can run faster without having to refetch instructions
from memory or cache.

New applications open the door for instructions far in 
excess of 32 bits, particularly in systems that use large, 
variable-length instructions spanning many bytes. To 
meet this challenge in the synchronous mode, up to four 
Am29338s may be cascaded horizontally to free up to 16 



consecutive bytes (one 128-bit word) for dequeueing in 
one cycle (Fig. 4a). Because each cascaded part is con- 
nected to a common 32-bit input bus, each chip holds the 
same information (Fig. 4b). When the Reset (or RXMIT) 
line is asserted, however, the internal dequeue pointers
are offset by the value programmed on the chip's position 
inputs, POS. 

Another frequent task for first-in, first-out buffers is as 
a straightforward I/O buffer. Many processor-memory 
systems have expanded their word length from 8 to 32 
bits, though the peripheral-controller chips have for the 
most part remained at 8 bits. The Am29338 buffer sup- 
plies a buffered path between peripherals and memory 
while making the necessary conversion from one word 
size to another. 

MESSAGE IN THE MAIL 

A communication mailbox usually serves to link two 
or more loosely coupled devices in a multiprogramming 
system. With the help of a first-in, first-out buffer, mes- 
sages from one device to another are queued in the mail- 
box. If the mailbox happens to be full, the sending process 
blocks data transfer until the mailbox has a slot free. If the 
mailbox is empty, the receiving process is blocked until 
the mailbox receives a message from the sending end. 
2. The data stacks make possible four different combinations
of byte swapping. As a result, data can be
queued at one width and dequeued at another.
3. The FIFO buffer can function as an instruction-prefetch
queue by coupling it with a separate instruction-fetch
unit. The CPU runs faster by reading repetitive
instruction loops from the byte queue without
addressing main memory.
Otherwise, the sending and receiving processes run
concurrently. 
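
The full and empty conditions of a mailbox can be sketched in software with Python's standard `queue.Queue`. This is an analogy only, not a model of the hardware: the two-slot capacity and the function names `send` and `receive` are assumptions, and a failed call stands in for the point at which the process would be blocked.

```python
import queue

mailbox = queue.Queue(maxsize=2)  # hypothetical two-slot mailbox


def send(message):
    """Try to post a message; False means the sender would have to block."""
    try:
        mailbox.put_nowait(message)
        return True
    except queue.Full:
        return False


def receive():
    """Try to take a message; None means the receiver would have to block."""
    try:
        return mailbox.get_nowait()
    except queue.Empty:
        return None


results = [send("a"), send("b"), send("c")]  # third send finds the mailbox full
first = receive()                            # frees a slot
retry = send("c")                            # now succeeds
```

Messages come out in the order they went in, which is the first-in, first-out behavior the mailbox relies on.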

When devices are run on separate processors in a mul- 
tiprocessor system, a hardware mailbox is needed. The 
Am29338 can help create such mailboxes (Fig. 5), serving
to transfer variable-length messages from one processor 
to another. 

In this design example, two AmPAL16R4 program- 
mable-logic arrays serve as the interface to the Am29338, 
one each for the sending and receiving processors. The ar- 
rays serve as a conduit to examine the status of the FIFO 
buffer and also enable a programmable interrupt. In oper- 
ation, the processor wishing to send a message to the 
mailbox calls a special operating-system routine. This 
routine first reads the status of the mailbox; if it is not full, 
the message is written. Then the routine returns to the 
calling process. If the mailbox is full, the operating-system
routine blocks the calling process and enables interrupts
from the mailbox. When a slot becomes available,
the sending processor is interrupted. The interrupt routine
sends the message, disables interrupts from the mailbox,
and unblocks the sending process. The receiving side of
4. Up to four FIFO buffers can be horizontally cascaded
to support large word-width computer applications.
Up to four devices can create one 128-bit
word or a combination of 8-bit bytes (a). Buffers are
combined by offsetting the internal queue and dequeue
pointers.



the mailbox, of course, operates in an inverse manner. 

From the practical standpoint, the state of the mailbox
is first examined by asserting the Chip Select (CS), Read/
Write (R/W) and Control/Data (C/D) lines of the appropriate
PAL device and monitoring the buffer's Full
flag. An interrupt enable can then be written by bringing
the R/W line low. The actual message may be transmitted
from the processor to the mailbox by bringing the
PAL's CS and R/W lines low.

Conversely, messages from the mailbox are sent to the 
receiving end by asserting CS and R/W of the appropri- 
ate PAL device, and bringing its C/D line low. The mail- 
box status is examined by asserting CS, R/W and C/D. 
The interrupt-enable bit can be written by bringing CS 
and C/D high, and R/W low. 

The mailbox, finally, can be extended to operate in a 
heterogeneous multiprocessing system. In that system, 
processes with both disparate data-block widths and 
clock frequencies are interconnected — an easy task for 
this FIFO buffer. 

SYNCHRONOUS OR ASYNCHRONOUS OPERATION 

The Am29338 operates as most FIFO buffers do in the
asynchronous mode, as well as in the synchronous mode.
For the asynchronous mode, the Queue Clock input
(QCLK) and DQCLK lines serve as strobes to queue or
dequeue data and are generally independent of one another.
As a result, the buffer can connect two asynchronous
subsystems or link to an asynchronous bus such as the
VMEbus.

In a synchronous system, however, Enable signals are
easier to generate than strobes. Thus, the QCLK and
DQCLK signals may be simply derived from the common
subsystem clock. Queueing and dequeueing may
then be ordered with the Queue Enable (QEN) and Dequeue
Enable (DQEN) inputs. This technique makes it
easy to interface the buffer to a single subsystem or
synchronous bus, such as Multibus II.

As long as the FIFO buffer is neither full nor empty, 
the rates at which data flows in and out of the buffer are
independent of each other. The user stays abreast of the 
chip buffers' states by means of four status indicators: 
Full, Almost Full (A-Full), Empty, and Almost Empty 
(A-Empty). This is the role of the byte-count output. 

Besides the basic flags such as Full and Empty for indi- 
cating chip state, the Am29338 supplies indicators to 
warn of the exact condition of its buffers. The A-Full and 
A-Empty outputs, for example, show that there are less 
than 4 bytes of space available, or more than 4 bytes of 
data in the buffer. These indicators, like Full and Empty, 
are valid only for synchronous operation. 

Finer control over the amount of data stored is possible
with the 7-bit Byte Count output, which monitors the
number of bytes currently in the buffer. Unlike the other
status indicators, Byte Count is valid only in the synchronous
mode. In asynchronous operation, Byte Count is
undefined.

An example of applying the Byte Count indicator is il- 
lustrated by its use in control tasks. For instance, various 
system devices may need some minimum amount of data 
on hand before a given function can be carried out. In this 
particular case, an external comparator informs the sys- 
tem that the required information is indeed in the buffer. 

In all operations, the chip is first initialized by bringing 
the Reset line low. In tasks like instruction-prefetch 
queues, asserting Reset flushes the queue when a jump or 
branch instruction occurs. This action discards any pre- 
fetched instructions. 

DATA-BIT MECHANICS 

The number of bytes to be queued into the buffer is set 
by means of the Bytes Queued (BQ) inputs, and the corresponding
data is presented to the data (D) and data parity
(PD) inputs aligned to the least significant byte. When 
the QEN line is asserted, data will be entered on the fall- 
ing edge of the QCLK input. The device's internal point- 
ers will then be updated on the low-to-high transition of 
the clock. 

The number of bytes to be dequeued is determined by
the Bytes Dequeued (BDQ) input. If the Dequeue Enable
line (DQEN) is brought low, the state of the byte queue is
updated and data is offloaded on the low-to-high transition
of the DQCLK signal.

When the Output Enable line (OE) goes low, the next 
four bytes available for unloading and their correspond- 
ing parity bits are brought out on the data output (Y) and 
data parity (PY) lines. When OE moves high, the Y and
PY pins assume a high-impedance state.



As mentioned earlier, the chip relies on byte-wise parity
checking for error detection. Parity bits are checked
at the input, stored with the data, and checked again at
the output. Dual checking lends great flexibility to the
error-checking operation. In a task involving an
instruction-prefetch queue, for example, the designer may
choose to check parity only at the output. Then, only
executed instructions are checked. As a result, instructions
that were prefetched but never used (such as those
prefetched after a jump operation) will not cause spurious
interrupts.

In typical operation, the data input parity-error output 
(PDERR) will go high if any of the bytes being queued 
have a parity error. The output parity-error line 
(PYERR) goes high if any of the bytes on the output bus 
have a parity error. Only valid bytes are checked for data 
anomalies; bytes on the data-input bus which are not being
queued or undefined bytes which are sent out when
the byte queue is almost empty are not included in the
checking for errors.
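
Byte-wise parity checking restricted to valid bytes can be sketched as below. The text does not say whether the device uses odd or even parity, so the `odd` flag is an assumption exposed as a parameter, and the function names are hypothetical.

```python
def parity_bit(byte, odd=True):
    """Return the parity bit that accompanies one data byte."""
    ones = bin(byte & 0xFF).count("1")
    even_bit = ones & 1          # makes the total number of 1s even
    return even_bit ^ 1 if odd else even_bit


def parity_error(data, parity_bits, valid_count, odd=True):
    """Flag an error if any of the first `valid_count` bytes fails its check.

    Bytes beyond `valid_count` are ignored, mirroring the rule that bytes
    not being queued (or undefined output bytes) are not checked.
    """
    return any(
        parity_bit(data[i], odd) != parity_bits[i]
        for i in range(valid_count)
    )


data = [0x00, 0xFF, 0x01, 0x80]
good = [parity_bit(b) for b in data]
bad = good[:3] + [good[3] ^ 1]   # corrupt the parity of the fourth byte only
```

With the corrupted fourth bit, a check over all four bytes reports an error, while a check over the first three does not.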

Tim Olson, a senior planning engineer at Advanced Micro 
Devices, is in charge of developing microprocessor architec- 
tures and Am29300 family building blocks. Olson has a 
BSEE-computer science degree from the University of Colorado
at Boulder and an MSEE from the University of Arizona
at Tucson.
5. Circuitry for a simple hardware mailbox needs only one Am29338 FIFO buffer and two programmable-logic
arrays for links to transmit and receive controllers. Three signal lines collectively check chip status and
control information flow: CS, R/W, and C/D. A fourth line (IREQ) indicates interrupt requests.
6.10 DIGITAL SYSTEMS VME 29300-1

Digital Systems offers the VME-29300-1, an Am29300-Family-based
CPU designed for those applications
requiring the high performance of a 32-bit processor.
Intended for use in emulating other computers or for
special-purpose computing such as graphics, encoding/decoding,
and data reduction, the processor can be supplied
with or without firmware. Its key features are:

• 100 ns per micro-instruction

• 4K words of Writable-Control-Storage 

• 88-bit-wide microcode loaded from 27512 
EPROM. 

• On-board firmware address lights (single- 
stepping provided) 

• N-way branching up to 64 ways

• 64 registers, 32 bits, 3-ported 

• Calculated register address to 16-way 

• Handles all seven interrupt levels 

• Under firmware control: A16/A24/A32 and D8/ 
D16/D32 

Introduction 

The VME-29300-1 CPU comes in a double-high two-board
set. Both boards have P1 and P2 connectors for
backplane connections, and in addition, control lines are 
interconnected between boards using two ribbon cables. 
The Instruction Board contains the Am29331 Se- 
quencer, address read-out, microprogram memory, 
pipeline registers, and writable-control-storage circuitry. 

The Arithmetic Board contains the Am29332 ALU, the 
Am29334 Register File, the calculation registers and 
latches, the constants ROM, and the address and data I/O
circuitry. Board positions and spacing within the VME
rack can be customized. 

Am29331 — Microprogram Sequencer 

The Am29331 chip is configured as a 12-bit micropro- 
gram sequencer. The sequencer has multiway branch 
instructions that allow 1-of-N consecutive addresses to 
be selected as the branch target in a single cycle. The N- 
way branching can be chosen as 4-way, 8-way, 16-way, 
or 64-way by the microcode. Combinations of M, A, and
D input lines of the Am29331 are used for this choice. A
stack within the sequencer stores return addresses, loop
addresses, and loop counts. It has 33 levels to permit the
deep nesting of subroutines and loops. The lower 12
output lines address the 4096-word microprogram 
memory, each word of which has a width of 88 bits. (The 
upper 4 address bits are not used.) Output data from the 
memory are fed to the pipeline registers. 



Writable-Control-Storage 

The Writable-Control-Storage (WCS) circuitry consists 
of a 27512 EPROM and the associated circuitry to control
loading. At power-on time, the loader brings the microprogram
into the 4Kx88 random-access memory, stepping
the Am29331 sequencer through a series of addresses.
Then each word of the microprogram is
checked back against the EPROM bit pattern. When this 
task is complete, the WCS loader is disabled and the 
sequencer takes control. For debugging purposes the 
microprogram can be single-stepped, and the WCS 
loader again controls the Am29331 sequencer. The 
address readout displays each address (in a readable 
fashion) during single-stepping. 

Am29334 — Register File 

The two Am29334 chips serve as a 64x32 external
register file for the ALU. Each of these is a high-speed,
random-access memory configured with one write port
(D) and two read ports (A,B). The D port is fed from the
32-bit wide Y bus, while the A port feeds the MA bus and
the B port feeds the CB bus. Control of write operations
is done with the common write enable to each chip. This
allows the lower-16 or upper-16 bits to be stored separately
and gives the four different write options:

• Write no data at all 

• Write only the lower 16 bits 

• Write only the upper 16 bits 

• Write all 32 bits simultaneously 
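
The four write options above amount to two independent half-word write enables. The following sketch models them in software; the class name, method names, and the two boolean enables are hypothetical and stand in for whatever control lines the board actually uses.

```python
class RegisterFile:
    """Toy 64 x 32 register file with separate lower/upper 16-bit writes."""

    def __init__(self, size=64):
        self.regs = [0] * size

    def write(self, addr, value, write_lower, write_upper):
        """Update only the enabled half words of register `addr`."""
        current = self.regs[addr]
        if write_lower:
            current = (current & 0xFFFF0000) | (value & 0x0000FFFF)
        if write_upper:
            current = (current & 0x0000FFFF) | (value & 0xFFFF0000)
        self.regs[addr] = current

    def read(self, addr_a, addr_b):
        """Read two registers at once, as the A and B ports would."""
        return self.regs[addr_a], self.regs[addr_b]


rf = RegisterFile()
rf.write(5, 0xAAAABBBB, True, True)    # write all 32 bits
rf.write(5, 0x11112222, True, False)   # write only the lower 16 bits
after_lower = rf.regs[5]
rf.write(5, 0x33334444, False, True)   # write only the upper 16 bits
after_upper = rf.regs[5]
rf.write(5, 0xFFFFFFFF, False, False)  # write no data at all
```

The last call leaves the register unchanged, which is the "write no data" option.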

Read operations are controlled by a common output
enable for reading all 32 bits to the A or B port. The A 
address bus originates in the writable control store 
(WCS) while the B and D address buses originate in the 
address calculation circuitry. By calculating the B and D 
addresses the CPU achieves a high degree of micropro- 
gram flexibility. 

Am29332— ALU 

The Arithmetic Logic Unit (ALU) processes 32-bit-wide 
data paths. This means that it allows one-, two-, three-, 
or four-byte data in arithmetic and logic operations as
well as multiprecision arithmetic and multiple-bit shift 
operations. The data flow uses two input buses, MA and 
CB, and one output bus, Y. Operation on data of variable 
byte length, variable-length bit fields, or even single bits 
is made possible by the internal mask generator. This 
circuit creates a 32-bit mask for each instruction while 
using no overhead time. The mask is used as an additional
operand in each instruction to allow operation on
the selected data widths. Instructions that operate on
variable-length bit fields require a mask that is a contiguous
string of 1s for all selected bit positions and 0s for all
unselected bit positions. In cases where the field exceeds
the 32-bit boundary, the mask does not wrap
around, allowing operation on a contiguous field across
a word boundary.
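
A contiguous field mask of this kind can be sketched in a few lines. The parameterization is an assumption: `position` is taken as the bit number of the field's least significant bit and `width` as its length, which may not match the way the Am29332 instruction actually encodes the field.

```python
WORD_MASK = 0xFFFFFFFF


def field_mask(position, width):
    """Return 1s for bits position .. position+width-1, truncated at bit 31.

    Bits that would fall beyond the 32-bit boundary are dropped rather
    than wrapped around to bit 0.
    """
    if not 0 <= position < 32 or not 0 <= width <= 32:
        raise ValueError("position must be 0..31 and width 0..32")
    return (((1 << width) - 1) << position) & WORD_MASK


inside = field_mask(4, 8)      # field fits inside the word
crossing = field_mask(28, 8)   # field runs past bit 31
```

For the crossing case only the four bits that lie inside the word are set; the remaining four would belong to the mask for the next word.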

For most single-operand instructions, the unselected bit
positions pass the corresponding bits of the operand
unmodified. For most two-operand instructions, the
unselected bit positions pass the corresponding bits of
the operand on the CB input unmodified. Thus, for
two-operand instructions the mask allows the merging of the
two operands in a single cycle. In addition to being used 
internally, the mask can be sent out over the Y bus as a 
pattern for testing purposes. 
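
The merging effect of the mask on a two-operand instruction can be written as a single expression. This is a sketch of the behavior described above, not of the ALU's internal implementation, and the function name is hypothetical.

```python
WORD = 0xFFFFFFFF


def masked_result(operation_result, cb_operand, mask):
    """Take selected bits from the operation and the rest from the CB operand."""
    return (operation_result & mask) | (cb_operand & ~mask & WORD)


merged = masked_result(0x12345678, 0xAABBCCDD, 0x0000FF00)
```

Only the byte selected by the mask comes from the operation result; every other bit is a copy of the CB operand.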

The Am29332 uses a funnel shifter with two 32-bit input 
ports and one 32-bit output port. This circuit can perform 
all of the operations of a barrel shifter (one N-bit input port 
and one N-bit output port) extended to two operands 
instead of one. Such a circuit is used to shift or rotate the 
operand up or down from 0 to 32 bits in a single cycle.
This is very useful in operations such as the normaliza- 
tion of a mantissa for floating-point arithmetic or in 
applications where the packing and unpacking of data 
are frequent operations. In addition, it can extract a 32-bit 
contiguous field across the two operands, a function 
which is very useful in some graphics applications. Also, 
any of its operations can be followed by a logical opera- 
tion with both completed in a single cycle. 
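
A funnel shifter can be modeled as a 32-bit window sliding over the 64-bit concatenation of its two inputs. The sketch below assumes that view; the function name and the convention that `shift` counts bits up from the low end are illustrative choices, not the device's control encoding.

```python
M32 = 0xFFFFFFFF


def funnel_shift(high, low, shift):
    """Extract 32 bits from the 64-bit value high:low, `shift` bits up from bit 0."""
    if not 0 <= shift <= 32:
        raise ValueError("shift must be 0..32")
    combined = ((high & M32) << 32) | (low & M32)
    return (combined >> shift) & M32


field = funnel_shift(0x12345678, 0x9ABCDEF0, 16)   # field spanning both words
rotated = funnel_shift(0x80000001, 0x80000001, 1)  # same operand twice: rotate
shifted = funnel_shift(0, 0x000000F0, 4)           # zero upper word: shift down
```

Feeding the same operand to both ports gives a rotate, and feeding zero to one port gives a plain shift, which is how one circuit covers the barrel-shifter operations as well as field extraction.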



The Am29332 easily handles prioritization which is use- 
ful in controlling N-way branches, performing normaliza- 
tions, and in graphic operations such as polygon fills. The 
built-in priority encoder sends out a 5-bit binary weighted 
code that signifies the relative position of the most 
significant 1 of the byte width selected. This allows 
prioritization on either 8-, 16-, 24-, or 32-bit operands.
The priority encoder output can be passed on to the Y bus 
or stored in the status register. 
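
One plausible reading of the priority code is the distance of the most significant 1 from the top bit of the selected width. The sketch below implements that reading; the exact code the Am29332 produces, including its result for an all-zero operand, is not given here, so both the encoding and the `None` return value are assumptions.

```python
def priority(operand, width_bytes=4):
    """Return how far the most significant 1 lies below the top selected bit.

    Returns None when no bit is set within the selected width.
    """
    if width_bytes not in (1, 2, 3, 4):
        raise ValueError("width must be 1 to 4 bytes")
    bits = 8 * width_bytes
    value = operand & ((1 << bits) - 1)
    if value == 0:
        return None
    return bits - value.bit_length()


p32 = priority(0x00010000, 4)
p16 = priority(0x0001, 2)
p8 = priority(0x80, 1)
```

Under this reading the result is also the number of positions a mantissa must be shifted up to normalize it, which is one of the uses mentioned above.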

The Complete VME-29300-1

The VME-29300-1 is a complete 32-bit processor when
firmware is in place. It will operate on the VMEbus as a
master or an interrupt-handler. Since it is not a
fixed-instruction-set processor, firmware must be designed for
proper operation. However, this is its outstanding advan- 
tage over other processors. Firmware options are almost 
limitless, giving the processor its high degree of adapta- 
bility to virtually any computing job. Chief among the 
suitable applications of this CPU is its ability to emulate
other computing systems. This capability is not limited to
32-bit processors, of course. Eight-bit and 16-bit systems
are also easily emulated. Other complex computing jobs 
are also possible such as reducing large amounts of data 
and executing graphics programs. 

Digital Systems will design the firmware and deliver it 
with your system or provide design advice at an hourly 
rate by phone call or site visit. 
12-bit Microprogram Sequencer

• Provides 100-ns microcycle time to support 32-bit
high performance system

• Supports 4-way, 8-way, 16-way, and 64-way 
branching chosen by the microcode 

• Contains built-in conditional test logic for use with 
the ALU status bits 

• A 33-level stack provides support for loops and
subroutine nesting 

• Supports single-stepping for the purpose of 
debugging 

• 12-bit address readout provided

Microprogram Memory

• Provides 4096-word capacity with a word width of 
88 bits of writable-control-storage 

• A 27512 EPROM allows customized firmware to 
be easily replaced or modified 

Register File 

• Two cascaded high-speed RAM chips for 64x32-bit
register capacity

• Write control allows independent lower-16 or
upper-16 bits of storage

• Provides one WRITE port (D) and two READ 
ports (A, B) and four WRITE options 

• Calculated B and D addresses provide high 
degree of microprogram flexibility 



ALU 

• A combinatorial architecture with equal cycle time 
for all Instructions, two input ports, and one 
output port 

• Funnel shifter allows N-bit shift-up, shift-down,
32-bit barrel shift or 32-bit field extract 

• Supports one-, two-, three-, and four-byte data 
for all operations and variable length fields for 
logical operations 

VME Characteristics

• Double-high, two-board set occupies 4 slots 

• Power requirements: +5 VDC @ 3 A (max), +12
VDC @ 0 A, -12 VDC @ 0 A

• Operating range: 0-70°C, 80% relative humidity,
forced cooling required 

• Interrupt handler options: 1-7 

• Requester option: R(3) used 

• Master data transfer options: A16/A24/A32 and
D8/D16/D32 

Additional information is available upon request from: 

Digital Systems Corporation 

3 North Main Street 

Walkersville, MD 21793

(301) 845-4141
7.1 THE Am29300/29C300 TIMING ANALYSIS

With the Am29300, you can construct a system with a 
family cycle time of 80 ns or faster. This is especially true
with the Am29300A. This section discusses the various 
critical paths in determining the fastest family cycle time. 
The following systems configuration was assumed: 



Control Path

Am29331/29C331    16-bit Microprogram Sequencer
Am29818A          Pipeline Register
Am99C68           Control Memory
Am27S55A          Registered PROM

Data Path

Am29332/29C332    32-Bit ALU
Am29334/29C334    64 x 18 Dual Port Register File
Am29818A          Status Register


Non-Pipelined Operation 

The block diagram surrounding the Am29300/29C300
family is shown in Figure 7-1 and its critical timing 
analysis is described in Tables 7-1 and 7-2. This timing 
analysis shows that a system cycle time of 75 ns is 
possible with the Am29300/29300A family, and 90 ns is 
possible with the Am29C300/29C300-1 family. The 
summary of the performance is listed in Table 7-5. 

Pipelined Operation 

With the two pipeline stages in the Am29C334
(PIPE = HIGH), you can construct pipelined systems
with the Am29C300. As an example of this operation,
the following describes a double-pipelined system. In this
example, the Am27S55A registered PROM is utilized
to improve the control path. Figure 7-2 shows an
example of the pipelined system.



Writing the Data into the Register File

It takes two cycles to write data into the register file. In the
first cycle, the data from the main memory is latched into
the input pipeline register. Then in the second cycle, the
data is written into the RAM location in the Am29C334.
(See cycles 1-2 in Table 7-3.)

Data Calculation and Storage 

In the first cycle, data (A1) to be operated upon is latched
from the RAM location onto the output pipeline register of
the Am29C334. In the second cycle, the operation is
performed on the data (A1,B1) by the Am29C332. The
result (C1) is then set up on the input pipeline register of
the Am29C334. In the last cycle, the result is written into
the RAM location of the Am29C334. For an example,
refer to cycles 3-6 of Table 7-3.

The second of these cycles is the most critical of the
three. The maximum propagation delay incurred in this
cycle then has to be compared with the maximum
control path timing. The cycle time is determined by the
longer of the two. The speed and choice of the main
memory have to be based on the cycle time.

It is possible to time-share the above two operations. In
other words, data can be written into the register file at 
the same time the operation is performed on the data 
from the register file. See Table 7-3 for an example. 

Table 7-4 shows the calculation of the pipelined 
Am29C300 system. As you notice, testing of the ALU 
status through the Am29C331 is critical for the control 
path, and the data path involving I-Y of the Am29C332 is
also critical. The table shows that the data path deter- 
mines the cycle time. The result is shown in Table 7-5. 
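
The rule that the longer of the two paths sets the cycle time can be checked with a short calculation. The delay figures below are taken from Table 7-4; the setup times (6 ns for the Am29818A and 15 or 13 ns for the Am29C334) are inferred from the printed totals, so treat them as assumptions. The dictionary and function names are hypothetical.

```python
# Delays in ns, in path order: clock-to-output, logic delay(s), setup time.
CONTROL_PATH = {
    "Am29C300":   [11, 24, 20, 6],  # Am29818A, Am29C331, Am27S55A, Am29818A
    "Am29C300-1": [11, 22, 20, 6],
}
DATA_PATH = {
    "Am29C300":   [11, 66, 15],     # Am29818A, Am29C332, Am29C334
    "Am29C300-1": [11, 47, 13],
}


def cycle_time(family):
    """Return the pipelined cycle time: the longer of the two critical paths."""
    return max(sum(CONTROL_PATH[family]), sum(DATA_PATH[family]))


times = {family: cycle_time(family) for family in CONTROL_PATH}
```

For both speed grades the data path is the longer one, which agrees with the statement above that the data path determines the cycle time.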

It is quite possible to improve the cycle time further with 
combinations of the Am29300, Am29300A, Am29C300, 
and Am29C300-1 . 
Figure 7-1. Am29300/29C300 System Timing Analysis
Table 7-1. Bipolar Am29300 Timing Analysis

Device        Function         Path        Am29300 (ns)   Am29300A (ns) (3)

Am27S55A (1)  Pipeline Reg.    CP-Q        10             10
Am29331       Sequencer        D-Y         19             17
Am27S55A      RPROM            A-Q         20             20
Total:                                     49             47

Am27S55A      Pipeline Reg.    CP-Q        10             10
Am29331       Sequencer        I-Y         25             22
Am27S55A      RPROM            A-Q         20             20
Total:                                     55             52

Am29818A (2)  Status Register  CP-Q        11             11
Am29331       Sequencer        T-Y         25             22
Am27S55A      RPROM            A-Q         20             20
Total:                                     56             53

Am27S55A      Pipeline Reg.    CP-Q        10             10
Am29332       ALU              I-Y         47             40
Am29334       Reg. File        D-CP         9              9
Total:                                     66             59

Am27S55A      Pipeline Reg.    CP-Q        10             10
Am29332       ALU              I-C,Z,N,L   48             41
Am29818A      Status Reg.      Y-CP         6              6
Total:                                     64             57

Am27S55A      Pipeline Reg.    CP-Q        10             10
Am29334       Reg. File        A-Y         24             24
Am29332       ALU              D-C,Z,N,L   43             37
Am29818A      Status Reg.      D-CP         6              6
Total:                                     83             77

Am27S55A      Pipeline Reg.    CP-Q        10             10
Am29334       Reg. File        A-Y         24             24
Am29332       ALU              D-Y         35             30
Am29334       Reg. File        D-CP         9              9
Total:                                     78             73



Notes: 1. In this timing analysis, a registered PROM is used to store microcode. WCS can also be implemented as a
replacement for the registered PROM.

2. The specifications can be improved by choice of the pipeline registers.

3. This is only applicable for the Am29331A and the Am29332A.
Table 7-2. CMOS Am29C300 Timing Analysis (Non-pipelined Mode)

Device        Function         Path        Am29C300 (ns)  Am29C300-1 (ns)

Am29818A (2)  Pipeline Reg.    CP-Y        11             11
Am29C331      Sequencer        D-Y         22             20
Am99C68 (1)   WCS              A-Y         40             40
Am29818A      Pipeline Reg.    D-CP         6              6
Total:                                     79             77

Am29818A      Pipeline Reg.    CP-Q        11             11
Am29C331      Sequencer        I-Y         24             22
Am99C68       WCS              A-Y         40             40
Am29818A      Pipeline Reg.    D-CP         6              6
Total:                                     81             79

Am29818A      Status Reg.      CP-Q        11             11
Am29C331      Sequencer        T-Y         24             22
Am99C68       WCS              A-Y         40             40
Am29818A      Pipeline Reg.    D-CP         6              6
Total:                                     81             79

Am29818A      Pipeline Reg.    CP-Q        11             11
Am29C332      ALU              I-Y         66             47
Am29C334      Reg. File        D-CP        15             13
Total:                                     92             71

Am29818A      Pipeline Reg.    CP-Q        11             11
Am29C332      ALU              I-C,Z,N,L   67             48
Am29818A      Status Reg.      Y-CP         6              6
Total:                                     84             65

Am29818A      Pipeline Reg.    CP-Q        11             11
Am29C334      Reg. File        A-Y         32             26
Am29C332      ALU              D-C,Z,N,L   60             43
Am29818A      Status Reg.      D-CP         6              6
Total:                                    109             86

Am29818A      Pipeline Reg.    CP-Q        11             11
Am29C334      Reg. File        A-Y         32             26
Am29C332      ALU              D-Y         49             35
Am29C334      Reg. File        D-CP        15             13
Total:                                    107             85



Notes: 1. WCS is used to store microcode. The registered PROM can be utilized as a replacement for the WCS.

2. The specifications can be improved by choices of the pipeline register. 

3. An external register is used to store status output of the ALU. If the internal status register is used, the cycle 
time will be faster by eliminating the setup time of the external register. 
Table 7-3. Pipelined Timing Sequence (Data Path)

Cycle                    1       2       3          4       5       6       7

Am29C334  I/P            A1 (1)  A2      A3         A4      A5/C1   A6/C2
          RAM (write)            A1      A2         A3      A4      A5/C1
          RAM (read)                     A1/B1 (2)  A2/B2   A3/B3   A4/B4   A5/B5
          O/P                                       A1/B1   A2/B2   A3/B3   A4/B4
Am29C332  ALU                                               C1      C2      C3



Legend: I/P = Input Pipeline Register

O/P = Output Pipeline Register

Ci = Ai op Bi (op = Am29C332 Operation)

Note: 1. For example, A1/B1 stands for (data derived from A port)/(data derived from B port).
2. Assumption is made that data B1 is already stored in the Am29C334.




Figure 7-2. Block Diagram

Table 7-4. Pipelined Cycle Time Calculation

Control Path                     Am29C300   Am29C300-1
Am29818A    CP-Q                    11          11
Am29C331    T-Y                     24          22
Am27S55A    Add. Setup              20          20
Am29818A    D-CP                     6           6
Total:                              61          59

Data Path                        Am29C300   Am29C300-1
Am29818A    CP-Q                    11          11
Am29C332    I-Y                     66          47
Am29C334    D-CP                    15          13
Total:                              92          71
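The arithmetic behind Tables 7-4 and 7-5 can be sketched as follows. This is a minimal illustration, not AMD's own procedure: it assumes that the pipelined cycle time is set by the slower of the control path and the data path, which is consistent with the 92 ns and 71 ns pipelined entries of Table 7-5. The setup-time entries that are hard to read in the table are taken here as the remainder of each column total.

```python
# Path delays in ns from Table 7-4 as (Am29C300, Am29C300-1).
# Setup entries (D-CP) are inferred from the printed column totals.
CONTROL_PATH = {
    "Am29818A CP-Q": (11, 11),
    "Am29C331 T-Y": (24, 22),
    "Am27S55A address setup": (20, 20),
    "Am29818A D-CP": (6, 6),
}
DATA_PATH = {
    "Am29818A CP-Q": (11, 11),
    "Am29C332 I-Y": (66, 47),
    "Am29C334 D-CP": (15, 13),
}


def path_total(path, grade):
    """Sum the delays along one path; grade 0 = Am29C300, 1 = Am29C300-1."""
    return sum(delays[grade] for delays in path.values())


def cycle_time(grade):
    """Assumed rule: the slower of the two paths sets the clock period."""
    return max(path_total(CONTROL_PATH, grade), path_total(DATA_PATH, grade))


if __name__ == "__main__":
    for grade, name in enumerate(("Am29C300", "Am29C300-1")):
        print(name, cycle_time(grade), "ns")
```

Under this assumption the data path is the limiting path for both speed grades.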


Table 7-5. Am29300/29C300 Family Cycle Time (ns)

                 Am29300   Am29300A   Am29C300   Am29C300-1
Non-Pipelined       83        77        109          86
Pipelined          N/A       N/A         92          71





7.2 THERMAL CHARACTERISTICS/ 
AIR FLOW 

DEFINITION OF THERMAL RESISTANCE 

The reliability of an integrated circuit is largely dependent on 
the maximum temperature which the device will attain during
operation. Because the stability of a semiconductor junction 
declines with increasing temperature, knowledge of the ther- 
mal properties of the packaged device becomes an important 
factor during device design. In order to increase the operating 
lifetime of a given device, the junction temperatures must be 
minimized. This demands knowledge of the thermal resistance 
of the completed assembly and specification of the conditions 
in which the device will function properly. As devices become 
both smaller and more complex and the requirement for high 
speed operation becomes more important, heat dissipation 
will become an ever more critical parameter. 

Thermal resistance is defined as the temperature rise per unit
power dissipation above some referenced condition. The unit
of measure is typically °C/watt. The relationship between
junction temperature and thermal resistance is given by:

$$T_J = T_x + P_D\,\theta_{Jx} \qquad (1)$$

where: $T_J$ = junction temperature
       $T_x$ = reference temperature
       $P_D$ = power dissipation
       $\theta_{Jx}$ = thermal resistance
       x = some defined test condition

In general, one of three conditions is defined for measurement 
of thermal resistance: 

$\theta_{JC}$ - thermal resistance measured with reference to the
temperature at some specified point on the package surface.

$\theta_{JA}$ (still air) - thermal resistance measured with respect to
the temperature of a specified volume of still air.

$\theta_{JA}$ (moving air) - thermal resistance measured with respect to
the temperature of air moving at a specified velocity.

The relationship between $\theta_{JC}$ and $\theta_{JA}$ is

$$\theta_{JA} = \theta_{JC} + \theta_{CA}$$

where $\theta_{CA}$ is a measure of the heat dissipation due to natural
convection (still air) or forced convection (moving air) and the
effect of heat radiation and mounting techniques. $\theta_{JC}$ is
dependent solely on material properties and package geometry;
$\theta_{JA}$ includes the influence of the surface area of the
package and environmental conditions. Each of these definitions
of thermal resistance is an attempt to simulate some
manner in which the packaged device may be used.

The thermal resistance of a packaged device, however 
measured, is a summation of the thermal resistances of the 
individual components of the assembly. These in turn are 
functions of the thermal conductivity of the component mate- 
rials and the geometry of the heat flow paths. Like other 
material properties, thermal conductivity is usually tempera- 
ture dependent. For alumina and silicon, two common pack- 
age materials, this dependence can amount to a 30% 
variation in thermal conductivity over the operating tempera- 
ture range of the device. The thermal resistance of a compo- 
nent is given by 

$$\theta = \frac{L}{K(T)\,A} \qquad (2)$$

where: L = length of the heat flow path
       A = cross sectional area of the heat flow path
       K(T) = thermal conductivity as a function of temperature

and the overall thermal resistance of the assembly (discounting
convective effects) will be:

$$\theta = \sum_n \theta_n = \sum_n \frac{L_n}{K_n(T)\,A_n}$$

but since the heat flow path through a component is influ- 
enced by the materials surrounding it, determination of L and 
A is not always straightforward. 

A second factor that affects the thermal resistance of a 
packaged device is the power dissipation level and, more 
particularly, the relationship between power level and die 
geometry, i.e., power distribution and power density. By 
rearrangement of equation 1 to 



$$P_D = \frac{1}{\theta_{Jx}}\,(T_J - T_x) \qquad (3)$$

the relationship between $P_D$ and $T_J$ can be more clearly seen.
Thus, to dissipate a greater quantity of heat for a given
geometry, $T_J$ must increase and, since the individual $\theta_n$ will
also increase with temperature, the increase in $T_J$ will not be a
linear function of increasing power levels.

A third factor of concern is the quality of the material 
interfaces. In terms of package construction, this relates 
specifically to the die attach bond, and for those packages 
having a heatsink, the heatsink attach bond. The quality of the 
die attach bond will most severely influence the package 
thermal resistance as this is the area which first impedes the 
transfer of heat out of the silicon die. Indeed, it seems likely 
that the initial thermal response of a powered device can be 
directly related to the quality of the die attach bond.



EXPERIMENTAL METHOD 

The technique for measurement of thermal resistance involves 
the identification of a temperature-sensitive parameter on the 
device and monitoring this parameter while the device is 
powered. For bipolar integrated circuits the forward voltage of 
the substrate isolation diode provides a convenient parameter 
to measure and has the advantage of a linear dependence on 
temperature. MOS devices which do not have an accessible 
substrate diode present greater measurement difficulties and 
may require simulation through use of a specially designed 
thermal test die. Choice of the parameter to be measured 
must be made with some care to ensure that the results of the 
measurement are truly representative of the thermal state of 
the device being investigated. Thus measurement of the 
substrate isolation diode which is generally diffused across the 
area of the die yields a weighted average of the condition of 
the individual junctions across the die surface. Measurement 
of a more local source would yield a less generalized result. 

For MOS devices, simulation is accomplished using the thermal
test die. The basis for this test die is a 25 mil square cell
containing an isolated diode and a 1 kΩ resistor. The resistors
are interconnected from cell to cell on the wafer before it is cut
into multiple arrays of the basic unit cell. In use the device is
powered via the resistors with voltage or current adjusted for 
the proper level and the voltage drop of the individual diodes is 
monitored as in the case of actual devices. 

Prior to the thermal resistance test, the diode voltage/ 
temperature calibration must be determined. This is done by 
measuring the forward voltage at 1 mA current level at two 
different temperatures. The diode calibration factor is then: 



$$K_F = \frac{T_2 - T_1}{V_2 - V_1} = \frac{\Delta T}{\Delta V} \qquad (4)$$



in units of °C/mV. For most diodes used for this test the 
voltage/temperature relationship is linear and these two 
measurement points are sufficient to determine the calibration. 

The actual thermal resistance measurement has two alternat- 
ing phases: measurement and power on. The device under 
test is pulse powered with an ON duty cycle of 99% and a 
repetition rate of < 100 Hz. During the brief OFF states the
device is reverse-biased with a 1 mA current and the voltage 
drop is measured. The series of voltage readings are averaged 
over short periods and compared to the voltage reading 
obtained before the device was first powered ON. The thermal 
resistance is then computed as: 



$$\theta_{Jx} = \frac{K_F\,(V_F - V_I)}{V_H\,I_H} = \frac{K_F\,\Delta V}{V_H\,I_H} \qquad (5)$$

where: $K_F$ = calibration factor
       $V_I$ = initial forward voltage value
       $V_F$ = current forward voltage value
       $V_H$ = heating voltage
       $I_H$ = heating current
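The two-step calculation of equations (4) and (5) can be sketched as below. The function names and all numeric readings are hypothetical and chosen only to illustrate the arithmetic; they are not measurements from this data book.

```python
def calibration_factor(t1_c, v1_mv, t2_c, v2_mv):
    """Equation (4): K_F = (T2 - T1) / (V2 - V1), in degC per mV."""
    return (t2_c - t1_c) / (v2_mv - v1_mv)


def thermal_resistance(k_f, v_initial_mv, v_final_mv, v_heat, i_heat):
    """Equation (5): theta = K_F * (V_F - V_I) / (V_H * I_H), in degC per W."""
    return k_f * (v_final_mv - v_initial_mv) / (v_heat * i_heat)


if __name__ == "__main__":
    # Hypothetical calibration: 700 mV at 25 degC and 500 mV at 125 degC.
    k_f = calibration_factor(25.0, 700.0, 125.0, 500.0)
    # Hypothetical test: diode voltage falls from 700 mV to 640 mV while
    # the device is heated with 5.0 V at 0.4 A (2 W).
    theta = thermal_resistance(k_f, 700.0, 640.0, 5.0, 0.4)
    print(f"K_F = {k_f} degC/mV, theta = {theta:.1f} degC/W")
```

Because the diode forward voltage falls as temperature rises, both the calibration factor and the voltage change are negative in this example, and their product gives a positive temperature rise.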
The pulsing measurement is continued until the device has
reached thermal equilibrium and the final value measured is
the equilibrium thermal resistance of the device under test.

When the end result desired is $\theta_{JA}$ (still air), the device and the
test fixture (typically a standard burn-in socket) are enclosed in
a box containing approximately 1 cubic foot of air. For $\theta_{JC}$
measurements the device is attached to a large metal
heatsink. This ensures that the reference point on the device
surface is maintained at a constant temperature. The requirements
for measurement of $\theta_{JA}$ (moving air) are rather more
complex and involve the use of a small wind tunnel with
capability for monitoring air pressure, temperature and velocity
in the area immediately surrounding the device tested. Standardization
of this last test requires much careful attention.



[Figure: Waveforms for Pulsed Thermal Resistance Test (voltage and current waveforms)]
Table 7-6. Am29300 Thermal Resistance (°C/W) (Note 1)

                                          Am29325GC  Am29331GC  Am29332GC  Am29334GC
θJA, Junction-to-Ambient, Still Air          19.0       21.8       15.0       22.0
θJA, 200 Linear Feet per Minute               7.0        7.7        8.0        7.5
θJA, 600 Linear Feet per Minute               5.5        5.1        6.0        5.0
θJC, Junction-to-Case (Note 2)                2.5        2.5        2.5        2.5


Notes: 1. The air flow should be measured in the vicinity of the heatsink.

2. This is the measured value based on a 144-pin PGA with heatsink
attached. The value should not vary significantly over the family.
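As a worked illustration of equations (1) and (3) with the Table 7-6 figures, the sketch below estimates junction temperature for the Am29332GC under the three air-flow conditions. The ambient temperature and power dissipation used are hypothetical values picked for the example, not device specifications.

```python
# theta_JA values in degC/W for the Am29332GC, read from Table 7-6.
THETA_JA_AM29332 = {"still air": 15.0, "200 lfpm": 8.0, "600 lfpm": 6.0}


def junction_temperature(t_ambient_c, power_w, theta_ja):
    """Equation (1) with ambient as reference: T_J = T_A + P_D * theta_JA."""
    return t_ambient_c + power_w * theta_ja


def max_power(t_junction_max_c, t_ambient_c, theta_ja):
    """Equation (3): P_D = (T_J - T_A) / theta_JA."""
    return (t_junction_max_c - t_ambient_c) / theta_ja


if __name__ == "__main__":
    ambient_c = 50.0  # hypothetical ambient temperature
    power_w = 2.0     # hypothetical power dissipation
    for condition, theta in THETA_JA_AM29332.items():
        t_j = junction_temperature(ambient_c, power_w, theta)
        print(f"{condition}: T_J = {t_j:.1f} degC")
```

The same two functions can be run in reverse: given a junction temperature limit, `max_power` returns the dissipation that a given air flow can support.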
7.3 CMOS/BIPOLAR RELIABILITY

Reliability Monitor 
Program 

AMD Specification 01-011 

The Reliability Monitor Program (RMP) is an extensive 
effort to measure the reliability of all process families at 
AMD on a regular basis. Typically 7,000 to 10,000 devices 
per month are tested in a variety of environmental stresses. 

The Reliability Monitor Program has two purposes:

Improved Reliability Performance: Each reject found 
undergoes failure analysis. Results are used by AMD to 
identify and establish corrective actions to eliminate failure 
mechanisms. 

Generation of Reliability Data: Reliability results are 
utilized in many ways. Typical applications include assessing
the benefits of burn-in, providing estimates of typical life- 
times, modeling field applications, and determining suita- 
bility of plastic and hermetic packaging in various 
temperature and humidity environments. This information 
is available to the customer. 

The stress tests employed are listed in Table 2:

Table 2. Reliability Monitor Stress Conditions

STRESS                            DURATION      SAMPLE   CONDITIONS
                                                SIZE     HERMETIC           PLASTIC
Early Life                        160 hours     300      125°C              125°C or 85°C
Operating Life                    1000 hours    120      150°C and 125°C    125°C or 85°C
Extended Operating Life           2000 hours    120      150°C and 125°C    125°C or 85°C
  (Biannual)
Temperature Cycle                 1000 cycles   50       -65°C to 150°C     -65°C to 150°C
Biased Temperature and Humidity   1000 hours    50       N/A                85°C & 85% RH, 5 V alt. bias
Pressure Cooker                   160 hours     50       N/A                121°C, 15 psig, no bias



The results from the Reliability Monitor Program form the 
basis of the failure rate calculations presented in the 
appendix. 



The Estimation of Field Reliability 

In this section, a modeling procedure is described for esti- 
mating reliability under field conditions, based on the 
lifetest data generated in the Reliability Monitor Program. 
The summaries of the lifetest results and the actual failure
rate projections are contained in the appendix. 



A General 
Reliability Model 

In order to evaluate the reliability of the product in the 
field, a general reliability model is utilized. The modeling 
procedure is described by authors Paul A. Tobias and 
David C. Trindade in the text Applied Reliability (New
York: Van Nostrand Reinhold, 1986, pp. 173-182). 

The failure probability F(t) may be viewed as the proba- 
bility that a random unit drawn from the population fails 
by time t. Thus, F(t) may be represented in terms of a 
cumulative distribution function (CDF) of the times to 
failure. 

To understand the general reliability model it is useful to 
think of failures in terms of the three D's: dead, defective, 
or deficient. The general model encompasses (1) the dis- 
covery of functionally dead test escapes, (2) the defective 
subpopulations, and (3) the typical competing failure 
modes of the main population, which are typically indica- 
tive of design, material, or process deficiencies. 

The complete model for the field use CDF may be rep- 
resented as: 

$$F_T = \alpha F_e + \beta F_d + (1 - \alpha - \beta)\,F_N,$$

where $F_e$ is the discovery distribution for the proportion $\alpha$
of test escapes, $F_d$ is the life distribution for the proportion
$\beta$ of units in the defective subpopulations, and $F_N$ is the
life distribution derived from the N typical competing failure modes.

For $F_N$, the competing nature arises because a unit is
viewed as a series system of different potential failure
mechanisms such that the occurrence of any one failure
mechanism results in failure of the unit. Thus, $F_N = 1 - R_1 R_2 R_3 \cdots R_N$,
where $R_i$ is the reliability function for a specific
failure mechanism. For the series model, failure rates
at any point in time are additive.

The distribution for the test escapes is not an actual life dis- 
tribution, but describes the application dependent rate at 
which the escapes may be discovered in use. This category 
also includes good units damaged in test or handling. 
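The structure of the general model can be sketched in a few lines. The function names and the numeric inputs are hypothetical; the sketch only shows how the mixture and the series (competing) terms combine at a single point in time.

```python
import math


def competing_cdf(reliabilities):
    """F_N = 1 - R_1 * R_2 * ... * R_N for a series system of mechanisms."""
    return 1.0 - math.prod(reliabilities)


def field_cdf(alpha, f_escape, beta, f_defect, f_main):
    """F_T = alpha * F_e + beta * F_d + (1 - alpha - beta) * F_N."""
    return alpha * f_escape + beta * f_defect + (1.0 - alpha - beta) * f_main


if __name__ == "__main__":
    # Hypothetical values at one time t: two mechanisms in the main
    # population, 100 PPM test escapes, 200 PPM defective units.
    f_main = competing_cdf([0.9999, 0.9995])
    print(field_cdf(100e-6, 0.5, 200e-6, 1.0, f_main))
```

With Python 3.8 or later, `math.prod` is available in the standard library; an explicit loop would serve equally well on older versions.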
Failure Distributions

The lognormal and Weibull CDFs are the distributions
most often used to represent reliability failure mechanisms.
The exponential distribution, characterized by a constant
failure rate, is a special case of the Weibull. The lognormal
distribution is specified by two parameters: T50, the
median time to failure, and sigma, the shape parameter.
Similarly, the Weibull distribution, which can be written in
closed form as $F(t) = 1 - \exp[-(t/c)^m]$, is characterized by
a characteristic life c and a shape parameter m. The value
of the shape parameter determines whether the failure
rate is increasing (m > 1), decreasing (m < 1), or constant
(m = 1). The exponential distribution, $F(t) = 1 - \exp[-(t/c)]$,
is specified completely by the one parameter c called the
mean time to failure (MTTF). Figures below show failure
rates for several values of the shape parameters of the lognormal
and Weibull distributions, respectively.
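The effect of the Weibull shape parameter on the failure rate can be checked numerically. The sketch below uses the standard Weibull hazard function, h(t) = (m/c)(t/c)^(m-1), which follows from the CDF given above; the function names are illustrative.

```python
import math


def weibull_cdf(t, c, m):
    """F(t) = 1 - exp(-(t/c)^m), with characteristic life c and shape m."""
    return 1.0 - math.exp(-((t / c) ** m))


def weibull_hazard(t, c, m):
    """Instantaneous failure rate h(t) = (m/c) * (t/c)^(m-1)."""
    return (m / c) * (t / c) ** (m - 1.0)


if __name__ == "__main__":
    # Characteristic life fixed at 1, as in the Weibull hazard figure.
    for m in (0.5, 1.0, 2.0):
        rates = [weibull_hazard(t, 1.0, m) for t in (0.5, 1.0, 2.0)]
        print(f"m = {m}: hazard at t = 0.5, 1, 2 ->", rates)
```

For m = 1 the hazard is the constant 1/c, which is the exponential case described in the text.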



[Figure: Lognormal Failure Rate (Hazard)]

[Figure: Weibull Failure Rate (Hazard), Characteristic Life = 1]




For the general reliability model to be applied, the distri- 
butions and associated parameters must be determined, 
either through reliability studies or a review of the relia- 
bility literature. In addition, if the experimentation is 
performed under accelerated conditions, acceleration 
models are needed to relate the results to field use. For 
distributions such as the lognormal or Weibull, accelera- 
tion factors are applied to the scale parameter (such as 
the median or characteristic life, respectively), in order to 
generate a new scale parameter from which failure rates 
at various field conditions may be estimated. Under true 
linear acceleration, the type of distribution and the shape 
parameter do not change between stress and field 
conditions. 



Calculation of 
Failure Rates 

To estimate field failure rates from reliability studies, many 
factors must be considered. One primary requirement is 
the identification of individual failure mechanisms in order 
to ascribe the failures to the proper categories used in the 
general reliability model. 

Considerations and Assumptions 

1. The fraction of test escapes and the underlying discovery
distribution:

The fraction of test escapes and contributions from damage
occurring as a result of testing and handling procedures
at the vendor or customer are estimable only from
actual field usage, since the underlying discovery distribution
is application dependent. To model these test escapes,
a Weibull distribution with a decreasing failure rate may
be used. In the appendix, test escapes, which represent an
unknown early adder to the model, are assumed negligible.
Temperature acceleration considerations do not apply
to test escapes since the units are basically inoperative.

2. The fraction of defective subpopulations and the under- 
lying distribution: 

The lifetimes for the fraction defective subpopulations may
be modeled by the exponential distribution. Reliability
results from stress testing must be carefully analyzed in
order to identify the true defect related failure modes.
From such studies at AMD, the mean time to failure (MTTF)
for the defective subpopulations has been found to be
approximately 100 hours at 125°C. The fraction $\beta$ of
product with defects is computed from the CDF estimate of
defect related failures at readout time t by the following
equation:

$$\beta = \frac{\mathrm{CDF}}{1 - e^{-t/100}}.$$
To combine the results from lifetests at different temperatures
or from dissimilar readout times, a pooled estimate
of $\beta$ may be calculated as the weighted mean of the individual
$\beta$ estimates. Sample size is the weighting factor.
Based on the reliability literature, an activation energy of
0.45 eV has been chosen as representative.
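The fraction-defective estimate and its pooling can be sketched as follows. The function names and the sample inputs are hypothetical; the 100-hour MTTF at 125°C is the value quoted in the text.

```python
import math


def fraction_defective(cdf_estimate, readout_hours, mttf_hours=100.0):
    """beta = CDF / (1 - exp(-t / MTTF)) for defect related failures."""
    return cdf_estimate / (1.0 - math.exp(-readout_hours / mttf_hours))


def pooled_fraction(estimates, sample_sizes):
    """Weighted mean of individual beta estimates, weighted by sample size."""
    total = sum(sample_sizes)
    return sum(b * n for b, n in zip(estimates, sample_sizes)) / total


if __name__ == "__main__":
    # Hypothetical lifetests: 2 defect rejects in 5000 units at 168 hours,
    # 1 defect reject in 2000 units at 1000 hours.
    b1 = fraction_defective(2 / 5000, 168.0)
    b2 = fraction_defective(1 / 2000, 1000.0)
    print(pooled_fraction([b1, b2], [5000, 2000]) * 1e6, "PPM")
```

At a 168-hour readout the correction is noticeable because a portion of the defective units has not yet failed; at 1000 hours the denominator is essentially 1.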

3. The distributions of the competing failure mechanisms in 
the main population: 

Competing failure mechanisms may occur during either 
early fail or long term lifetesting. The distribution of life- 
times is modeled by a lognormal distribution with a sigma 
specific to each failure mechanism. The sigma value may 
be determined from the reliability literature and checked 
for reasonableness against values estimated from the 
data. Also from the reliability data giving the fraction
failed for various mechanisms at stress readouts, the
median time to fail (T50) at stress conditions may be
estimated. To combine the results for a specific mechanism
from several lifetests, a pooled median time to fail,
weighted by sample size, is computed from the individual
ln T50 estimates.

The acceleration factors specific to a failure mechanism 
may be applied to the pooled stress T50 to estimate the 
field T50. This field median life estimate may then be used 
with the same sigma to estimate the expected CDF in the 
field for a given mechanism at a chosen time. The individ- 
ual failure rates for each mechanism may be summed to 
arrive at the total device failure rate. 
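The projection from stress to field conditions can be sketched as below. The text does not spell out the acceleration model; the sketch assumes the usual Arrhenius temperature acceleration, and the stress T50 value used in the example is hypothetical.

```python
import math

BOLTZMANN_EV_PER_K = 8.617e-5  # Boltzmann constant in eV/K


def arrhenius_factor(ea_ev, t_field_c, t_stress_c):
    """Assumed Arrhenius model: AF = exp(Ea/k * (1/T_field - 1/T_stress))."""
    t_field_k = t_field_c + 273.15
    t_stress_k = t_stress_c + 273.15
    return math.exp(ea_ev / BOLTZMANN_EV_PER_K * (1.0 / t_field_k - 1.0 / t_stress_k))


def lognormal_cdf(t, t50, sigma):
    """Lognormal CDF with median t50 and shape parameter sigma."""
    z = (math.log(t) - math.log(t50)) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    # Hypothetical mechanism: stress T50 of 1e10 hours at 125 degC,
    # sigma 5.5, activation energy 0.3 eV, field junction at 55 degC.
    af = arrhenius_factor(0.3, 55.0, 125.0)
    field_t50 = 1e10 * af  # acceleration applied to the scale parameter
    print("CDF at 100 khrs in the field:", lognormal_cdf(1e5, field_t50, 5.5))
```

The acceleration factor multiplies only the median life; sigma is carried over unchanged, in line with the statement that the shape parameter does not change under true linear acceleration.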

4. The treatment of zero rejects for a possible failure 
mechanism: 

Just because failures for a given mechanism are not
observed does not mean such mechanisms are non-existent.
The sample size may be insufficient or the acceleration
may be inadequate to reveal all possible low level
reliability concerns. In fact, if the potential failure mechanisms
have low thermal activation energies, the demonstration
of reliability performance may be limited by
mechanisms with no observed failures!

For example, time dependent dielectric breakdown 
(TDDB) for MOS devices has a lognormal distribution with 
sigma around 5.5 and activation energy of 0.3 eV. If no 
TDDB failures are observed in a HTOL stress, it is still pos- 
sible to calculate a non-zero, upper confidence level for 
the CDF based on the given sample size. The use of such 
a low activation energy may be a significant factor when 
combining failure rates across all possible mechanisms 
having higher activation energies. 
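One common way to obtain a non-zero upper bound from zero observed failures is the exact binomial bound, sketched below. This is an assumption about the method, since the text does not state which bound is used; the 50% confidence default mirrors the "Rejects 50% conf." rows in the data summaries.

```python
def zero_failure_upper_bound(sample_size, confidence=0.5):
    """Upper bound p on the CDF such that seeing 0 failures in n units
    has probability (1 - confidence): (1 - p)^n = 1 - confidence."""
    return 1.0 - (1.0 - confidence) ** (1.0 / sample_size)


if __name__ == "__main__":
    # Sample size taken from the CMOS hermetic 150 degC lifetest.
    bound = zero_failure_upper_bound(1477)
    print(f"Upper CDF bound at 50% confidence: {bound * 1e6:.0f} PPM")
```

The bound shrinks roughly in proportion to 1/n, which is why small sample sizes leave a sizeable residual failure rate even when no rejects are seen.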

5. The incorporation of unknown failure mechanisms: 

Another significant factor in calculating failure rates is the 
manner in which unidentified mechanisms are incorpo- 
rated into the failure rate calculations. If the failure mech- 
anism is unknown, the rejects may be pooled into a 
category that uses fairly conservative activation energies 
of 0.3 eV for MOS and 0.5 eV for bipolar. Even though 
failure mechanisms are unidentified, it may still be possi- 
ble to estimate the lognormal sigmas from the data. 

6. Overall activation energies and the exponential 
distribution. 

In the reliability literature, it is common to see the use of 
overall activation energies, such as 0.7 eV for MOS and 
1.0 eV for bipolar technologies. In addition, the exponen- 
tial distribution is often assumed for all mechanisms. The 
use of an overall activation energy neglects those mech- 
anisms which are known to have lower activation energies 
and can result in estimates which are impressively low but
may be misleading. Furthermore, the use of the exponen- 
tial distribution for all cases may also result in inaccurate 
projections, since it is well established in the literature that 
most failure rate mechanisms have non-constant failure 
rates. 
CMOS

Channel Length: 1.5 µm    Gate Oxide Thickness: 250 Å    Metal Pitch: 4-6 µm

Product Types Tested: Static RAMs - Am99C68, Am99C88

Non-Volatile Memory Division - Am27C1024

Microprocessor - Am29C10A

Fixed Instruction Processor - Am82C288



Data Summary and Failure Rate Estimation for General Reliability Model 



Package Term 
Type of Model 



Failure 
Mechanism 



Test- Results 

168 hrs 1000 hra 

125°C 125°C 150' 



Reliability Modeling 
^ A Parameters 

(«V> @ 55° C 



Average Failure Rate (AFR) 

FITS @ 55X 

0-4khrs 4-30khrs 30-100khrs 



Hermetic 



Defective 

Subpopulations Cause not found 



Sample Size 

6,403 2,655 1,477 

Number of Rejects 



MTTF 
(hrs) 



Fraction 
Defective 
P (PPM) 



178 



Competing 
Mechanisms 



In(TSO) 



Failure Mechanism      E_A (eV)   Sigma   ln(T50)   AFR (FITS @ 55°C)
                                                    0-4 khrs   4-30 khrs   30-100 khrs
Corroded Metal           0.50      2.5      18         13          39           53
Cracked Oxide            1.00      9.0      44          9           2            1
Ionic Contamination      1.00      9.0      45          6           1            1
Charge Gain/Loss         0.80      9.0      44         11           2            1
Oxide Pinholes           0.30      5.5      28         51          22           12
Cause not found          0.30      5.5      27         94          37           19
Rejects 50% conf.        0.30      5.5      28         38          17            9



Totals 



Competing 
Mechanisms 



Rejects 50% cent, p 
Totals 



Sample Size 
516 216 

Number of Rejects 



Sigma ln(T50) 
[Figure: Instantaneous Failure Rate at Field Conditions. Curves Derived from General Reliability Model. Horizontal axis: Time (Thousand Hours); curves: Hermetic, Plastic, Total]



Traditional Method for Reliability Projection

Single Exponential Distribution Assumed, E_A = 0.7 eV
Stress Junction Temperature to Field Junction Temperature

Package    Stress              Sample    Equivalent        Rejects   Failure Rate
Type                           Size      Device Hours                (60% Confidence)
                                         at 55°C                     (FITS)
Hermetic   168 hrs 125°C        6,403     83,841,423          2
           1000 hrs 125°C       2,655    206,933,299          4
           1000 hrs 150°C       1,477    384,626,879          8
           Totals              10,535    675,401,600         14          23
Plastic    168 hrs 125°C          516      6,756,548          0
           1000 hrs 125°C         216     16,835,251          0
           Totals                 732     23,591,799          0          39



Package Related Tests 



Straae 


Package 

Type 


Sample 
Size 


Failure 
Mectianism 


Number of 
Relecta 


Percent 
Rejected 


Temperature 


Hermetic 
Plaatie 


150 
50 







0.00 


Cycle 
Preaaure Pot 


Totala 






0.00 
0.00 
IMOX II

Channel Length: N/A    Gate Oxide Thickness: N/A    Metal Pitch: 4-7 µm

Product Types Tested: Bipolar RAM - Am93422, Am93L412, Am93L422, Am93L425

Field Programmable Logic - AmPAL16H8, AmPAL16HD8, AmPAL16L8, AmPAL16L8L, AmPAL16R4,
AmPAL16R4L, AmPAL16R6, AmPAL16R6L, AmPAL16R8, AmPAL16R8L, AmPAL22V10

Bipolar PROM - Am27S25, Am27S29, Am27S31, Am27S33, Am27S181, Am27PS191, Am27S191

Interface and Logic Products - Am29827, Am29828, Am29333, Am29841, Am29843,
Am29844, Am29845, Am29853, Am29863, Am26LS14A

Microprocessor - Am2901C, Am2910A, Am29705A

Microcontroller - Am29116

Peripheral Products - Am8177

Data Summary and Failure Rate Estimation for General Reliability Model 





Hermetic

Sample Size: 22,718 (168 hrs 125°C), 7,060 (1000 hrs 125°C), 5,709 (1000 hrs 150°C)

Defective Subpopulations
Failure Mechanism         Number of   E_A (eV)   MTTF    Fraction Defective   AFR (FITS @ 55°C)
                          Rejects                (hrs)   β (PPM)              0-4 khrs
Damaged Metal                 1         0.45      848         50                 13
Foreign Material Oxide        3         0.45      848        151                 38
Wire Heel Broken              1         0.45      848         50                 13
Cause not Found               2         0.45      848        101                 25

Competing Mechanisms
Failure Mechanism         E_A (eV)   Sigma   ln(T50)   AFR (FITS @ 55°C)
                                                       0-4 khrs   4-30 khrs
Crystal Defects             0.70      9.0      45          5          1
Cracked Oxide               1.00      9.0      46          3          1
Rejects 50% conf.           0.50      4.0      24          8          8

Totals (AFR, FITS @ 55°C): 103 (0-4 khrs), 10 (4-30 khrs), 7 (30-100 khrs)



Plastic

Sample Size: 18,338 (168 hrs 125°C), 6,580 (1000 hrs 125°C)

Defective Subpopulations
Failure Mechanism         Number of   E_A (eV)   MTTF    Fraction Defective   AFR (FITS @ 55°C)
                          Rejects                (hrs)   β (PPM)              0-4 khrs
Glassivation Damaged          1         0.45      275         64                 16
Damaged Metal                 1         0.45      275         64                 16
Wire Clearance                1         0.45      275         64                 16
Cause not Found               3         0.45      275        191                 48

Competing Mechanisms
Failure Mechanism         E_A (eV)   Sigma   ln(T50)   AFR (FITS @ 55°C)
                                                       0-4 khrs   4-30 khrs   30-100 khrs
Ionic Contamination         1.00      9.0      46          4          1
Rejects 50% conf.           0.50      4.0      23         44         34          25
[Figure: Instantaneous Failure Rate at Field Conditions. Curves Derived from General Reliability Model.]
Traditional Method for Reliability Projection

Single Exponential Distribution Assumed, E_A = 1.0 eV
Stress Junction Temperature to Field Junction Temperature

Package    Stress              Sample    Equivalent          Rejects   Failure Rate
Type                           Size      Device Hours                  (60% Confidence)
                                         at 55°C                       (FITS)
Hermetic   168 hrs 125°C       22,718      931,618,604
           1000 hrs 125°C       7,060    1,724,695,417
           1000 hrs 150°C       5,709    6,530,194,964
           Totals              35,487    9,186,508,985
Plastic    168 hrs 125°C       18,338      366,584,446
           1000 hrs 125°C       6,580      740,419,943
           Totals              24,918    1,109,004,389          7           8








Package Related Tests

Stress            Package    Sample   Failure Mechanism       Number of   Percent
                  Type       Size                             Rejects     Rejected
Temperature       Hermetic   2,849    Lifted Metal                4         0.14
Cycle                                 Cracked Oxide               3         0.11
                                      Package Seal Cracks         1         0.04
                                      Package Seal Voids          1         0.04
                                      Cause not found             4         0.14
                                      Totals                     13         0.46
                  Plastic    2,603    Die Cracked                 1         0.04
                                      Glassivation Cracked        2         0.08
                                      Corroded Metal              1         0.04
                                      Metal-Metal Short           1         0.04
                                      Cracked Oxide               4         0.15
                                      Water in Package            2         0.08
                                      Wire Neck Broken            1         0.04
                                      Intermetallics              5         0.19
                                      Totals                     17         0.65
Temperature       Plastic    2,201    Cause not found             1         0.05
Humidity                              Totals                      1         0.05
Pressure Pot      Plastic    2,959    Die Cracked                 1         0.03
                                      Corroded Leads              1         0.03
                                      Corroded Metal              1         0.03
                                      Totals                      3         0.10
7.4 CMOS LATCH-UP TEST METHODS AND
RESULTS 

Latch-up is a phenomenon that occurs when a parasitic
PNPN structure on an IC chip is triggered and behaves
like an SCR between the VCC and GND rails. Once
initiated, the latch-up condition will persist until either the
power supply is removed or the device is destroyed. In
virtually all cases, the device is destroyed because of the
large current that can flow from the VCC to the ground pin
(the ON resistance of the SCR is very low).

Interior nodes of an IC could conceivably be prone to
latch-up, but this intrinsically rare condition would be
found during normal device testing and screening. Circuit
nodes interfacing with the "outside world" are much more
susceptible to latch-up because unusual transient conditions
may occur, in particular overshoot or ringing that
pulls the pin above the supply voltage or below GND.

To induce latch-up, the conditions on these pins must
meet two criteria: a) there must be sufficient voltage to
forward-bias critical junctions in the SCR, and b) the
available current must be in excess of the SCR trigger
current. If these conditions exist, and if a suitable parasitic
PNPN structure is connected to that pin, latch-up will
occur.

Some thought must be given to the test values of voltage
and current when determining susceptibility of a part to
latch-up. Reasonable test values would seem to be those
experienced in an actual system under worst-case
conditions.

Most AMD devices are designed to work with a nominal
+5V supply. In such a system, voltage transients resulting
from transmission line effects, etc., will not exceed
+5V in magnitude. Therefore, testing at a +10V extreme
(VCC plus 5V transient) and a -5V extreme (GND minus
5V transient) will simulate a worst-case system environment.

Current levels for latch-up testing are governed by the 
maximum current available from any device in the sys- 
tem. The maximum drive capability of any output pin is 
approximately 100 mA; adding some margin to this, the 
test value becomes 300 mA. Any current derived from the
voltage transient magnitude divided by the transmission 
line impedance will be considerably less than this. 

Latch-Up Testing 

Testing was performed by forcing 300 mA into and out of
each device pin, whether input or output, while monitoring
ICC for any indication of latch-up. The current sources
were voltage-limited at +10V and -5V, per the discussion
above. The test configurations are shown in Figures 7-4
and 7-5.

Normal outputs were set to the HIGH state when current 
was forced into the pin (positive current) and set to the 
LOW state when the current was pulled out of the device 
(negative). Outputs with three-state capability were addi- 
tionally tested in the high-impedance state. 

The test results are summarized in Table 7-7. For the test
limits indicated, no latch-up was induced for any pin of 
any part of any device type tested. 

Note that there was no positive current flow into the input 
pins since the inputs remained high-impedance up to the 
+10V clamp level. 



[Figures 7-4 and 7-5. Latch-up test configurations; legible labels: 300 mA, +5.5 V]
Table 7-7. CMOS Latch-Up Testing Summary

(Am29C01, Am29C10A, Am29C101)

Tested Pin                     Test Figure   Max I_I (mA)   Max V_I (V)   Latch-Up
Inputs                              1              0            +10          No
                                    2            -18             -5          No
Normal Outputs                      1           +300           +6.5          No
                                    2           -300           -1.4          No
Three-State Outputs (active)        1           +300           +6.6          No
                                    2           -300           -1.8          No
Three-State Outputs (High-Z)        1           +300            +10          No
                                    2           -300           -1.8          No



7.5 TEST PHILOSOPHY AND METHODS 

The following nine points describe AMD's philosophy 
for high volume, high speed automatic testing. 

1. Ensure that the part is adequately decoupled at the
test head. Large changes in VCC current as the device
switches may cause erroneous function failures due to
VCC changes.

2. Do not leave inputs floating during any tests, as they 
may start to oscillate at high frequency. 

3. Do not attempt to perform threshold tests at high 
speed. Following an output transition, ground current 
may change by as much as 400 mA in 5-8 ns. 
Inductance in the ground cable may allow the ground 
pin at the device to rise by hundreds of millivolts 
momentarily. 

4. Use extreme care in defining input levels for AC
tests. Many inputs may be changed at once, so there
will be significant noise at the device pins and they may
not actually reach VIL or VIH until the noise has settled.
AMD recommends using VIL ≤ 0 V and VIH ≥ 3.0 V for
AC tests.

5. To simplify failure analysis, programs should be de- 
signed to perform DC, Function, and AC tests as three 
distinct groups of tests. 
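
The ground shift described in point 3 follows from the inductor relation V = L * di/dt. The sketch below is illustrative only: the 400 mA step and the 5-8 ns edge times come from the text, while the ground-path inductance values are assumed for the example and are not AMD data.

```python
# Estimate the momentary rise of the device ground pin from V = L * di/dt.
# The current step (400 mA) and edge times (5 ns, 8 ns) are taken from point 3;
# the inductance values are hypothetical, chosen only for illustration.

def ground_bounce_volts(inductance_h: float, delta_i_a: float, delta_t_s: float) -> float:
    """Return the voltage across an inductance for a linear current ramp."""
    return inductance_h * delta_i_a / delta_t_s

DELTA_I = 0.400                            # A, from the text
EDGE_TIMES = (5e-9, 8e-9)                  # s, from the text
ASSUMED_INDUCTANCES = (2e-9, 5e-9, 10e-9)  # H, assumed ground-path inductance

if __name__ == "__main__":
    for inductance in ASSUMED_INDUCTANCES:
        for edge in EDGE_TIMES:
            millivolts = ground_bounce_volts(inductance, DELTA_I, edge) * 1e3
            print(f"L = {inductance * 1e9:4.0f} nH, dt = {edge * 1e9:.0f} ns"
                  f" -> {millivolts:6.1f} mV")
```

With an assumed 5 nH ground path, the 5 ns edge gives 400 mV and the 8 ns edge gives 250 mV, which is in line with the "hundreds of millivolts" quoted in point 3.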



6. Capacitive Loading for AC Testing 

Automatic testers and their associated hardware have 
stray capacitance that varies from one type of tester to 
another but is generally around 50 pF. This, of course, 
makes it impossible to make direct measurements of 
parameters which call for smaller capacitive load than 
the associated stray capacitance. Typical examples of 
this are the so-called "float delays," which measure the 
propagation delays into the high-impedance state and 
are usually specified at a load capacitance of 5.0 pF. 
In these cases, the test is performed at the higher load 
capacitance (typically 50 pF) and engineering correla- 
tions based on data taken with a bench setup are used 
to predict the result at the lower capacitance. 

Similarly, a product may be specified at more than one 
capacitive load. Since the typical automatic tester is 
not capable of switching loads in mid-test, it is impos- 
sible to make measurements at both capacitances 
even though they may both be greater than the stray 
capacitance. In these cases, a measurement is made 
at one of the two capacitances. The result at the other 
capacitance is predicted from engineering correla- 
tions based on data taken with a bench setup and the 
knowledge that certain DC measurements (IOH, IOL for 
example) have already been taken and are within 
spec. In some cases, special DC tests are performed 
in order to facilitate this correlation. 

7. Threshold Testing 

The noise associated with automatic testing (due to 
the long, inductive cables) and the high gain of the 
tested device when in the vicinity of the actual device 
threshold frequently give rise to oscillations when 
testing high speed circuits. These oscillations are not 
indicative of a reject device, but instead of an over- 
taxed test system. To minimize this problem, thresh- 
olds are tested at least once for each input pin. There- 
after, "hard" high and low levels are used for other 
tests. Generally this means that function and AC 
testing are performed at "hard" input levels rather than 
at VIL Max. and VIH Min. 

8. AC Testing 

Occasionally, parameters are specified that cannot 
be measured directly on automatic testers because of 
tester limitations. Data input hold times often fall into 
this category. In these cases, the parameter in ques- 
tion is guaranteed by correlating these tests with other 
AC tests that have been performed. These correla- 
tions are arrived at by the cognizant engineer by using 
precise bench measurements in conjunction with the 
knowledge that certain DC parameters have already 
been measured and are within spec. 

In some cases, certain AC tests are redundant, since 
they can be shown to be predicted by some other 
tests which have already been performed. In these 
cases, the redundant tests are not performed. 
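
The correlation approach described in points 6 and 8 amounts to measuring under a condition the tester can provide and predicting the specified condition from bench data. The sketch below assumes a linear delay-versus-capacitance model and uses invented bench numbers; neither the model nor the values come from AMD.

```python
# Sketch of an engineering correlation: fit delay = a + b * C to bench data,
# then shift a tester reading taken at 50 pF to a predicted value at 5 pF.
# The linear model and every number here are assumptions for illustration.

def fit_line(points):
    """Least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def predict_at(measured_delay_ns, measured_c_pf, target_c_pf, slope_ns_per_pf):
    """Shift a tester measurement to another load using the bench slope."""
    return measured_delay_ns - slope_ns_per_pf * (measured_c_pf - target_c_pf)

# Hypothetical bench data: (load capacitance in pF, float delay in ns)
BENCH = [(5.0, 10.0), (15.0, 10.5), (50.0, 12.25)]

if __name__ == "__main__":
    _, slope = fit_line(BENCH)
    tester_reading_ns = 12.5  # hypothetical reading at the 50 pF tester load
    print(predict_at(tester_reading_ns, 50.0, 5.0, slope))
```

In practice the model would be whatever the bench data supports; a straight line is used here only because it keeps the arithmetic visible.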

9. Output Short-Circuit Current Testing 

When performing IOS tests on devices containing RAM 
or registers, great care must be taken that undershoot 
caused by grounding the high-state output does not 
trigger parasitic elements which in turn cause the 
device to change state. In order to avoid this effect, it 
is common to make the measurement at a voltage 
(VOUTPUT) that is slightly above ground. The VCC is 
raised by the same amount so that the result (as 
confirmed by Ohm's law and precise bench testing) is 
identical to the VOUT = 0, VCC = Max. case. 

8.1 PHYSICAL DIMENSIONS* 

Plastic Leaded Chip Carrier (PC) 

NOTE: Package dimensions are given in inches. To convert to millimeters, multiply by 25.4. 

Ceramic Pin-Grid-Array Packages (CG/CGX) 
CGX120 (bottom view; drawing not reproduced) 
Legible dimension labels: .075 x 45° REF. (reference corner); 1.340/1.380; 1.200 BSC; .030 x 45° REF. (3 places); .060/.080; .100/.200 

CG120 (bottom view; drawing not reproduced) 
Legible dimension labels: .075 x 45° REF. (reference corner); 1.340/1.380; 1.200 BSC; .030 x 45° REF. (3 places) 

NOTE: Package dimensions are given in inches. To convert to millimeters, multiply by 25.4. 
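
The conversion in the note can be applied to the values printed on the CG120/CGX120 drawings. This is a minimal sketch; the function name is an assumption, and no meaning is attached to each value beyond the printed number.

```python
# Convert package dimensions from inches to millimeters (1 in = 25.4 mm).
# The values are numbers printed on the CG120/CGX120 drawings; which feature
# each one measures is not asserted here.

MM_PER_INCH = 25.4

def inches_to_mm(inches: float) -> float:
    """Return the length in millimeters, rounded to 3 decimal places."""
    return round(inches * MM_PER_INCH, 3)

PRINTED_INCHES = [1.340, 1.380, 1.200, 0.075, 0.030]

if __name__ == "__main__":
    for value in PRINTED_INCHES:
        print(f"{value:.3f} in = {inches_to_mm(value):.3f} mm")
```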






CHAPTER 8 
General Information 



8.2 ORDERING INFORMATION 

All Advanced Micro Devices' products listed are stocked locally and distributed nationally by Franchised Distributors. 
See back of this book for the location nearest you. Please consult them for the latest price revisions. For direct factory 
orders, call your local AMD Sales Office or Sales Representative. See the back of this book for the location nearest 
you. 

Minimum Order 

The minimum direct factory order is $100.00 for a standard product. The minimum direct factory order for burn-in 
product is $250.00. 

Product Ordering, Package and Temperature Range Codes 

The following scheme is used to identify Advanced Micro Devices' Standard products: 

Am29334  G  C  B
   |     |  |  |
   |     |  |  +-- Optional Processing
   |     |  +----- Temperature Range
   |     +-------- Package Type
   +-------------- Device Number

Package Type 

P = Plastic DIP 
D = Ceramic DIP 
G = Pin Grid Array 
J = Plastic Leaded Chip Carrier 

Temperature Range 

C = Commercial (0 to +70°C) 

Optional Processing 

Blank = Standard Processing 
B = Burn-in 


The following scheme is used to identify Advanced Micro Devices' Military (APL) products: 

Am29C334  /B  Z  C
   |      |   |  |
   |      |   |  +-- Lead Finish
   |      |   +----- Package Type
   |      +--------- Device Class
   +---------------- Device Number

Device Class 

/B = Class B 

Package Type 

X = DIP Packages 
Z = All Other Configurations (PGAs, etc.) 

Lead Finish 

C = Gold 
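
The two order-code schemes above can be decoded mechanically. The following is a minimal sketch: the lookup tables are copied from the text, while the regular expressions, the function name `decode`, and the assumption that fields may be separated by optional spaces are illustrative choices, not an AMD-defined format. Device numbers with a trailing letter suffix are not handled.

```python
import re

# Lookup tables copied from the ordering information above.
STANDARD_FIELDS = {
    "package": {"P": "Plastic DIP", "D": "Ceramic DIP",
                "G": "Pin Grid Array", "J": "Plastic Leaded Chip Carrier"},
    "temperature": {"C": "Commercial (0 to +70°C)"},
    "processing": {"": "Standard Processing", "B": "Burn-in"},
}
MILITARY_FIELDS = {
    "device_class": {"/B": "Class B"},
    "package": {"X": "DIP Packages",
                "Z": "All Other Configurations (PGAs, etc.)"},
    "lead_finish": {"C": "Gold"},
}

# Hypothetical patterns: "Am", digits, an optional "C" (CMOS), more digits,
# followed by the code letters in the order shown in the diagrams.
_DEVICE = r"(?P<device>Am\d+C?\d*)"
_STANDARD = re.compile(
    _DEVICE + r"\s*(?P<package>[PDGJ])\s*(?P<temperature>C)\s*(?P<processing>B?)")
_MILITARY = re.compile(
    _DEVICE + r"\s*(?P<device_class>/B)\s*(?P<package>[XZ])\s*(?P<lead_finish>C)")

def decode(code: str) -> dict:
    """Return the decoded fields of a standard or military (APL) order code."""
    text = code.strip()
    for scheme, pattern, tables in (("standard", _STANDARD, STANDARD_FIELDS),
                                    ("military", _MILITARY, MILITARY_FIELDS)):
        match = pattern.fullmatch(text)
        if match:
            result = {"scheme": scheme, "device": match.group("device")}
            for field, table in tables.items():
                result[field] = table[match.group(field)]
            return result
    raise ValueError(f"unrecognized order code: {code!r}")

if __name__ == "__main__":
    print(decode("Am29334 G C B"))
    print(decode("Am29C334 /B Z C"))
```

Codes that fit neither pattern raise `ValueError` rather than returning a partial result, so a mistyped package or class letter is caught immediately.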






ADVANCED MICRO DEVICES' NORTH AMERICAN SALES OFFICES 



ALABAMA (205) 882-9122 

ARIZONA (602) 242^1400 

CALIFORNIA, 

Culver City (213) 645-1524 

Newport Beach (714) 752-6262 

San Diego (619)560-7030 

San Jose (408) 249-7766 

Santa Clara (408) 727-3270 

Woodland Hills (818) 992-4155 

CANADA, Ontario, 

Kanata (613) 592-0060 

Willowdale (416) 224-5193 

COLORADO (303) 741-2900 

CONNECTICUT (203) 264-7800 

FLORIDA, 

Clearwater (813) 530-9971 

Ft. Lauderdale (305) 776-2001 

Melbourne (305) 729-0496 

Orlando (305) 859-0831 

GEORGIA (404) 449-7920 

ILLINOIS, 

Chicago (312) 773-4422 

Naperville (312) 505-9517 

INDIANA (317) 244-7207 



KANSAS (913)451-3115 

MARYLAND (301) 796-9310 

MASSACHUSETTS (617) 273-3970 

MINNESOTA (612) 938-0001 

MISSOURI (314) 275-4415 

NEW JERSEY (201) 299-0002 

NEW YORK, 

Liverpool (315) 457-5400 

Poughkeepsie (914) 471-8180 

Woodbury (516) 364-3020 

NORTH CAROLINA (919) 847-8471 

OHIO, 

Columbus (614) 891-6455 

Dayton (513) 439-0470 

OREGON (503) 245-0080 

PENNSYLVANIA, 

Allentown (215) 393-3006 

Willow Grove (215) 657-3101 

TEXAS, 

Austin (512) 346-7830 

Dallas (214)934-9099 

Houston (713) 785-9001 

WASHINGTON (206) 455-3600 

WISCONSIN (414) 792-0590 



ADVANCED MICRO DEVICES' INTERNATIONAL SALES OFFICES 



BELGIUM, 
Bruxelles ............... TEL (02) 771 91 42 
                          FAX (02) 762 37 12 
                          TLX 61028 

FRANCE, 
Paris ................... TEL (01) 49-75-10-10 
                          FAX (01) 49-75-10-13 
                          TLX 263282 

GERMANY, 
Hannover area ........... TEL (05143) 50 55 
                          FAX (05143) 55 53 
                          TLX 925287 
München ................. TEL (089) 41 14-0 
                          FAX (089) 406490 
                          TLX 523833 
Stuttgart ............... TEL (0711) 62 33 77 
                          FAX (0711) 625187 
                          TLX 721882 

HONG KONG, 
Kowloon ................. TEL 852-3-695377 
                          FAX 1234276 
                          TLX 504260AMDAPHX 

ITALY, Milano ........... TEL (02) 3390541 
                              (02) 3533241 
                          FAX (02) 3498000 
                          TLX 315286 

JAPAN, 
Tokyo ................... TEL (03) 345-8241 
                          FAX 3425196 
                          TLX J24064AMDTKOJ 
Osaka ................... TEL 06-243-3250 
                          FAX 06-243-3253 

KOREA, Seoul ............ TEL 82-2-784-7598 
                          FAX 82-2-784-8014 

LATIN AMERICA, 
Ft. Lauderdale .......... TEL (305) 484-8600 
                          FAX (305) 485-9736 
                          TLX 5109554261 AMDFTL 

NORWAY, 
Hovik ................... TEL (47) 2 537810 
                          FAX (47) 2 591959 
                          TLX 79079 

SINGAPORE ............... TEL 65-2257544 
                          FAX 2246113 
                          TLX RS55650 MMI RS 

SWEDEN, Stockholm ....... TEL (08) 733 03 50 
                          FAX (08) 733 22 85 
                          TLX 11602 

TAIWAN .................. TEL 886-2-7122066 
                          FAX 886-2-7122017 

UNITED KINGDOM, 
Farnborough ............. TEL (0252) 517431 
                          FAX (44252) 521041 
                          TLX 858051 
Manchester area ......... TEL (0925) 828008 
                          FAX (0925) 827693 
                          TLX 628524 
London area ............. TEL (04862) 22121 
                          FAX (04862) 22179 
                          TLX 859103 



NORTH AMERICAN REPRESENTATIVES 



CALIFORNIA 

I² INC OEM (408) 988-3400 

DISTI (408) 498-6868 
CANADA 
Calgary, Alberta 

VITEL ELECTRONICS (403) 278-5833 

Kanata, Ontario 

VITEL ELECTRONICS (613)592-0090 

Mississauga, Ontario 

VITEL ELECTRONICS (416) 676-9720 

Quebec 

VITEL ELECTRONICS (514)636-5951 

IDAHO 

INTERMOUNTAIN TECH MKGT (208) 888-6071 

INDIANA 

SAI MARKETING CORP (317) 253-1668 

IOWA 

LORENZ SALES (319) 377-4666 

KANSAS 

LORENZ SALES (913) 384-6556 



MICHIGAN 

SAI MARKETING CORP (313) 750-1922 

MISSOURI 

LORENZ SALES (314) 997-4558 

NEBRASKA 

LORENZ SALES (402) 475-4660 

NEW MEXICO 

THORSON DESERT STATES (505) 293-8655 

NEW YORK 

NYCOM, INC (315) 437-8343 

OHIO 
Columbus 

DOLFUSS ROOT & CO (614)885-4844 

Dayton 

DOLFUSS ROOT & CO (513) 433-6776 

Strongsville 

DOLFUSS ROOT & CO (216)238^)300 

PENNSYLVANIA 

DOLFUSS ROOT & CO (412)221-4420 

UTAH 

R² MARKETING (801) 595-0631 



Advanced Micro Devices reserves the right to make changes in its product without notice in order to improve design or performance 
characteristics. The performance characteristics listed in this document are guaranteed by specific tests, guard banding, design and 
other practices common to the industry. For specific testing details, contact your local AMD sales representative. The company 
assumes no responsibility for the use of any circuits described herein. 

TEL (408) 732-2400 • TWX: 910-339-9280 • TELEX: 34-6306 • TOLL FREE: (800) 538-8450 